Academic Article

A Weighted Stacking Ensemble Model With Sampling for Fake Reviews Detection
Document Type
Periodical
Source
IEEE Transactions on Computational Social Systems (IEEE Trans. Comput. Soc. Syst.), 11(2):2578-2594, Apr. 2024
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
General Topics for Engineers
Feature extraction
Hidden Markov models
Data models
Computational modeling
Support vector machines
Machine learning
Convolutional neural networks
Ensemble learning
fake review
machine learning
sampling techniques
Language
ISSN
2329-924X
2373-7476
Abstract
Customers rely on reviews as a primary source of information when judging a product or service. Positive reviews boost a company’s reputation and increase its revenue by attracting new clients and raising order sizes. Negative reviews, on the other hand, significantly reduce sales, and may in some cases be motivated by competitive advantage. Organizations can use fake (i.e., misleading or fraudulent) reviews to generate fast profits by deceiving customers into buying their products. Recently, various methods that use advances in machine learning have been introduced to assess the legitimacy of reviews. However, existing methods fall short of highly accurate detection when the classes are unbalanced. We aimed to create a spam review identification model that uses ensemble-based learning and balances the classes with sampling techniques. This article proposes a weighted stacking ensemble model with sampling (WSEM-S) for efficient fake review detection. We used $n$-gram models to model the language data for feature extraction. Experimental results on three customer review datasets (YELPNYC, Deceptive Opinion Spam Corpus (DOSC) v1.4, and Deception) show that the proposed model outperforms conventional machine learning techniques [Naïve Bayes, logistic regression, K-nearest neighbor (KNN), random forest, extreme gradient boosting (XGBoost), and convolutional neural network (CNN)] as well as state-of-the-art ensemble models.
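The abstract names three ingredients: $n$-gram features, class balancing by sampling, and a weighted combination of base models. The sketch below illustrates each idea in plain Python and is not the WSEM-S implementation. The function names, the choice of random oversampling, and the fixed weights are all assumptions made for illustration. In particular, a plain weighted average stands in for the stacking step, whose actual meta-learner and weighting scheme are not described in this record.

```python
import random
from collections import Counter


def word_ngrams(text, n_max=2):
    """Count word n-grams of length 1..n_max in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes match the largest.

    Random oversampling is an assumed stand-in; the record does not say
    which sampling techniques the paper uses.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def weighted_vote(probabilities, weights):
    """Weighted average of per-model fake-review probabilities (hypothetical weights)."""
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# Toy data, invented for illustration only.
reviews = ["great food great service", "terrible place", "best hotel ever", "loved it"]
labels = ["genuine", "fake", "genuine", "genuine"]

features = [word_ngrams(review) for review in reviews]
balanced_x, balanced_y = random_oversample(features, labels)

# Three hypothetical base-model outputs for one review, combined with assumed weights.
score = weighted_vote([0.9, 0.4, 0.7], weights=[0.5, 0.2, 0.3])
```

In a real pipeline, oversampling would be applied only to the training split, and the weights would be learned or tuned on held-out data instead of being fixed by hand.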