Academic Paper

On the Effectiveness of Features for Predicting User Churn in Reddit Communities
Document Type
Conference
Source
2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 1173-1178, Jun. 2023
Subject
Computing and Processing
Engineering Profession
General Topics for Engineers
Social networking (online)
Computational modeling
Predictive models
Feature extraction
Software
Data mining
Online community
Churn prediction
Reddit
Language
Abstract
Predicting which users are likely to leave a given online community has been studied for several types of platforms. The problem of predicting whether a given user is going to leave a given community is referred to as the user churn prediction problem. Existing studies have typically used features obtained from the user’s activity records in the community. In contrast, we use both social-network and inter-community features, as well as the basic features used in existing studies. In this paper, we focus on Reddit as an example of a popular online community platform. We extract several features from records of comments in Reddit communities, then use them to construct models for user churn prediction and to evaluate their prediction accuracy. Our results show that among the several features used in this paper, the number of comments posted by users is the most effective at predicting user churn, and the model using only the number of posted comments achieved an F1 score of 0.78. Although social-network and inter-community features can be used for user churn prediction, combining them with basic activity features does not improve prediction accuracy.
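As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch derives a single activity feature (comments per user in an observation window) and scores a simple threshold rule with F1. It is not the authors' method: the toy comment records, the window boundaries, the churn definition (no comments in the follow-up window), the threshold, and the helper names `comment_counts` and `f1_score` are all assumptions made for illustration only.

```python
from collections import Counter


def comment_counts(comments, start, end):
    """Count comments per user with start <= timestamp < end."""
    return Counter(user for user, ts in comments if start <= ts < end)


def f1_score(y_true, y_pred):
    """F1 for the positive class (True = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (user, day) comment records for one community.
comments = [
    ("a", 1), ("a", 3), ("a", 5), ("a", 12),
    ("b", 2),
    ("c", 4), ("c", 6), ("c", 15),
    ("d", 7), ("d", 11),
    ("e", 9),
]

# Assumed windows: days 0-9 are observed, days 10-19 decide the label.
observed = comment_counts(comments, 0, 10)
future = comment_counts(comments, 10, 20)

users = sorted(observed)
# Assumed churn definition: no comments at all in the follow-up window.
y_true = [future[u] == 0 for u in users]

# Assumed rule: users with fewer than THRESHOLD observed comments churn.
THRESHOLD = 2
y_pred = [observed[u] < THRESHOLD for u in users]

score = f1_score(y_true, y_pred)
print(f"F1 on toy data: {score:.2f}")
```

The score on this toy data says nothing about the reported 0.78; it only shows how a comment-count feature and an F1 evaluation fit together. A real study would replace the threshold rule with a trained classifier and evaluate on held-out users.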