Academic Paper
A Weighted Stacking Ensemble Model With Sampling for Fake Reviews Detection
Document Type
Periodical
Author
Source
IEEE Transactions on Computational Social Systems (IEEE Trans. Comput. Soc. Syst.), 11(2):2578-2594, Apr. 2024
Subject
Language
ISSN
2329-924X
2373-7476
Abstract
Customers rely on reviews as a primary source of information when judging a product or service. Positive reviews help boost companies’ reputations, increasing revenue by attracting new clients and raising purchase order sizes. Negative reviews, on the other hand, can significantly reduce sales and may themselves be motivated by competitive advantage. Organizations can use fake (i.e., misleading or fraudulent) reviews to generate fast profits by deceiving customers into buying their products. Recently, various machine learning methods for assessing the legitimacy of reviews have been introduced. However, existing methods fall short of highly accurate detection when classes are unbalanced. We aimed to create a spam review identification model using ensemble-based learning while balancing classes with sampling techniques. This article proposes a weighted stacking ensemble model with sampling (WSEM-S) for efficient fake review detection. We used $n$-gram models to model language data for feature retrieval. Experimental results on three customer review datasets, namely YELPNYC, Deceptive Opinion Spam Corpus (DOSC) v1.4, and Deception, show that the proposed model outperforms conventional machine learning techniques [Naïve Bayes, logistic regression, K-nearest neighbor (KNN), random forest, extreme gradient boosting (XGBoost), and convolutional neural network (CNN)] as well as state-of-the-art ensemble models.
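The three ingredients named in the abstract ($n$-gram features, class balancing by sampling, and a weighted combination of base models) can be illustrated with a minimal standard-library sketch. This is not the authors' WSEM-S implementation: the abstract does not specify the sampling technique, the base learners, or how the weights are obtained, so random oversampling and a plain weighted average of base-model probabilities are assumed here as simple stand-ins. The function names, toy reviews, probabilities, and weights are all hypothetical.

```python
import random
from collections import Counter


def word_ngrams(text, n_max=2):
    """Count word n-grams of length 1..n_max in a lowercased review."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes
    have as many samples as the largest class (assumed technique)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def weighted_combine(fake_probs, weights):
    """Weighted average of base-model P(fake) values for one review.
    A simplification: a full stacking model would train a meta-learner."""
    total = sum(weights)
    return sum(p * w for p, w in zip(fake_probs, weights)) / total


# Toy, made-up data: three genuine reviews and one fake review.
reviews = [
    "great food great service",
    "worst place ever",
    "great place",
    "great great deal",
]
labels = ["genuine", "genuine", "genuine", "fake"]

features = [word_ngrams(r) for r in reviews]
balanced_x, balanced_y = random_oversample(features, labels)
class_counts = Counter(balanced_y)

# Hypothetical P(fake) outputs from three base models and their weights.
score = weighted_combine([0.9, 0.4, 0.7], [0.5, 0.2, 0.3])
print(class_counts, round(score, 2))
```

In practice the weights would more plausibly be derived from each base model's validation performance, and resampling would be applied to the training split only so that evaluation data keeps its original class distribution.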