Academic Paper
Multilabel Aggressive Text Classification from Social Media using Transformer-based Approaches
Document Type
Conference
Source
2023 26th International Conference on Computer and Information Technology (ICCIT), pp. 1-6, Dec. 2023
Abstract
The prevalence of multilabel aggressive text on social media has a detrimental societal impact, prompting government agencies and tech corporations to take measures against its spread. Research to date has focused on high-resource languages such as English, leaving low-resource languages such as Bengali out of the spotlight. This work presents a transformer-based technique that classifies multilabel aggressive Bengali texts by their targets, to aid research in this area. A dataset (EM-BAD) containing 13,728 texts is developed with five target classes: Religious Aggression (ReAG), Political Aggression (PoAG), Verbal Aggression (VeAG), Gender Aggression (GeAG), and Racial Aggression (RaAG). Experimental results demonstrate that Bangla-BERT with an adjusted pooling layer and fine-tuning outperforms all ML, DL, and transformer-based baselines and existing techniques, achieving the highest weighted F1-score of 0.89 in the multilabel aggressive text classification task.
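To make the reported metric concrete, the following is a minimal sketch of how a weighted F1-score can be computed in a multilabel setting, where each text may carry several of the five target labels at once. The abstract does not give the authors' evaluation code, so this assumes the common support-weighted definition (per-class F1 averaged with weights equal to the number of true instances of each class). The helper names `to_multi_hot` and `weighted_f1` and the toy annotations are hypothetical and are not drawn from EM-BAD.

```python
# Label order follows the five target classes named in the abstract.
LABELS = ["ReAG", "PoAG", "VeAG", "GeAG", "RaAG"]


def to_multi_hot(tags, labels=LABELS):
    """Encode a set of label names as a 0/1 vector (one slot per class)."""
    return [1 if name in tags else 0 for name in labels]


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 over multi-hot vectors.

    Assumption: 'weighted' means each class's F1 is weighted by its number
    of true instances; the paper's exact protocol is not stated.
    """
    n_classes = len(y_true[0])
    weighted_sum = 0.0
    total_support = 0
    for c in range(n_classes):
        pairs = [(t[c], p[c]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        support = tp + fn                      # true instances of class c
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 is 0 when undefined
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy example with made-up annotations (not real data).
y_true = [
    to_multi_hot({"ReAG", "PoAG"}),
    to_multi_hot({"VeAG"}),
    to_multi_hot({"GeAG", "VeAG"}),
    to_multi_hot({"RaAG"}),
]
y_pred = [
    to_multi_hot({"ReAG"}),          # misses PoAG
    to_multi_hot({"VeAG"}),
    to_multi_hot({"GeAG", "VeAG"}),
    to_multi_hot({"RaAG", "PoAG"}),  # spurious PoAG
]

score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy case four classes are predicted perfectly and PoAG is always wrong, so the score is pulled down in proportion to PoAG's share of the true labels. Weighting by support means frequent classes dominate the average, which matters when the class distribution is imbalanced.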