학술논문

Offensive Language Detection on Social Media Based on Text Classification
Document Type
Conference
Source
2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC) Computing and Communication Workshop and Conference (CCWC), 2022 IEEE 12th Annual. :0092-0098 Jan, 2022
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
General Topics for Engineers
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Training
Support vector machines
Social networking (online)
Conferences
Text categorization
Pipelines
Blogs
offensive language detection
social media
machine learning
text mining
Language
Abstract
There is a concerning rise of offensive language on the content generated by the crowd over various social platforms. Such language might bully or hurt the feelings of an individual or a community. Recently, the research community has investigated and developed different supervised approaches and training datasets to detect or prevent offensive monologues or dialogues automatically. In this study, we propose a model for text classification consisting of modular cleaning phase and tokenizer, three embedding methods, and eight classifiers. Our experiments shows a promising result for detection of offensive language on our dataset obtained from Twitter. Considering hyperparameter optimization, three methods of AdaBoost, SVM and MLP had highest average of F1-score on popular embedding method of TF-IDF.