Academic Paper

WeChat Toxic Article Detection: A Data-Driven Machine Learning Approach
Document Type
Conference
Source
2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 916-921, Nov. 2018
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Signal Processing and Analysis
Feature extraction
Analytical models
Twitter
Machine learning
Distribution functions
Privacy
Language
ISSN
2640-0103
Abstract
Toxic information detection has recently attracted tremendous research interest because of the popularity of social networks and the wide spread of toxic information, which may have dire consequences for the public. Existing work extensively studies toxic article detection in open social networks from an information diffusion perspective. However, in closed social networks, as exemplified by WeChat Moments (WM), the diffusion process is not readily visible. To tackle the toxic article detection problem in closed social networks, in this paper we empirically study the articles spread in WM, which is based on WeChat, the largest Chinese social platform. In particular, we systematically analyze the user behavior and text information of normal and toxic articles and identify a striking difference between them. Furthermore, we design a new model named MAT-LSTM, which captures the impact of different kinds of text information. To improve the performance of automatic toxic article detection, we propose the XMATL framework, which extends MAT-LSTM and uses text information and user behavior characteristics in a holistic manner. We conduct extensive experiments on two real-world datasets and demonstrate that our proposed model can effectively detect toxic articles in WM and achieves an outstanding performance gain over classic methods.