학술논문

Detecting Arabic Spam Reviews in Social Networks Based on Classification Algorithms
Document Type
Academic Journal
Source
Transactions on Asian and Low-Resource Language Information Processing. 21(1):1-13
Subject
Arabic language
Facebook
spam detection
social networks
classification algorithms
Language
English
ISSN
2375-4699
2375-4702
Abstract
Reviews or comments that users leave on social media have great importance for companies and business entities. New product ideas can be evaluated based on customer reactions. However, this use of social media is complicated by those who post spam on social media in the form of reviews and comments. Designing methodologies to automatically detect and block social media spam is complicated by the fact that spammers continuously develop new ways to leave their spam comments. Researchers have proposed several methods to detect English spam reviews. However, few studies have been conducted to detect Arabic spam reviews. This article proposes a keyword-based method for detecting Arabic spam reviews. Keywords or Features are subsets of words from the original text that are labelled as important. A term's weight, Term Frequency–Inverse Document Frequency (TF-IDF) matrix, and filter methods (such as information gain, chi-squared, deviation, correlation, and uncertainty) have been used to extract keywords from Arabic text. The method proposed in this article detects Arabic spam in Facebook comments. The dataset consists of 3,000 Arabic comments extracted from Facebook pages. Four different machine learning algorithms are used in the detection process, including C4.5, kNN, SVM, and Naïve Bayes classifiers. The results show that the Decision Tree classifier outperforms the other classification algorithms, with a detection accuracy of 92.63%.