Academic Paper

Machine Learning Approach for Text Classification in Cybercrime
Document Type
Conference
Source
2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), pp. 1-6, Aug. 2018
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Facebook
Twitter
Machine learning
Computer crime
Training
Text categorization
Data extraction
Naive bayes
Scikit-learn
NLTK
Sentiment analysis
Language
Abstract
Machine learning is now used in a rapidly growing range of fields and plays a key role in sentiment classification. This project uses two training datasets: a publicly available online dataset, and a dataset of cybercrime-related text extracted from Facebook and Twitter with the Facepager software tool. The aim is to extract the cybercrime data, separate it into two labelled classes (positive and negative) as supervised machine learning requires, and pre-process it to obtain a clean training dataset. The goal is to use the cybercrime data to measure classifier accuracy (as a percentage) and to classify text with an associated confidence value, using NLTK and Scikit-learn. The results obtained with both datasets show that using the cybercrime dataset gives higher classifier accuracy.
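
The workflow the abstract describes (labelled text, pre-processing, a Naive Bayes classifier, an accuracy percentage and a per-prediction confidence value) can be illustrated with a minimal sketch. This is not the authors' code: it is a standard-library stand-in for the NLTK and Scikit-learn classifiers named above, using multinomial Naive Bayes with add-one smoothing. The class name NaiveBayesText, the helper functions, the toy sentences, and the choice to label threatening text as "negative" are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Crude pre-processing: lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesText:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict_proba(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            log_scores[label] = score
        # Normalise the log scores into posterior probabilities.
        top = max(log_scores.values())
        exps = {lab: math.exp(s - top) for lab, s in log_scores.items()}
        z = sum(exps.values())
        return {lab: v / z for lab, v in exps.items()}

    def predict(self, text):
        """Return (label, confidence), confidence being the posterior."""
        probs = self.predict_proba(text)
        label = max(probs, key=probs.get)
        return label, probs[label]


def accuracy_percent(model, texts, labels):
    """Percentage of texts whose predicted label matches the true label."""
    correct = sum(model.predict(t)[0] == y for t, y in zip(texts, labels))
    return 100.0 * correct / len(labels)


# Toy, hand-written data (assumption: not taken from the paper's datasets).
train_texts = [
    "i will hack your account and steal your password",
    "send money or i leak your photos",
    "great meeting you at the conference today",
    "thanks for the helpful tutorial",
]
train_labels = ["negative", "negative", "positive", "positive"]

model = NaiveBayesText().fit(train_texts, train_labels)
label, confidence = model.predict("i will steal your password")
print(label, round(confidence, 3))
```

With a real corpus, the toy lists would be replaced by the extracted posts and their labels, and accuracy_percent would be computed on a held-out test split rather than on the training data.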