Academic Paper

Two-stage Clustering Method for Discovering People's Perceptions: A Case Study of the COVID-19 Vaccine from Twitter
Document Type
Conference
Source
2021 IEEE International Conference on Big Data (Big Data), pp. 614-621, Dec. 2021
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
COVID-19
Time-frequency analysis
Systematics
Social networking (online)
Clustering methods
Scalability
Blogs
social media analysis
knowledge discovery
graph mining
clustering
time series
Abstract
Twitter is currently one of the most influential microblogging services, where users interact through messages. Analyzing its massive data stream is essential for grasping the big picture of the platform. In this study, we develop a two-stage clustering method that automatically discovers coarse-grained topics from Twitter data. In the first stage, we use graph clustering to extract micro-clusters from the word co-occurrence graph; all tweets in a micro-cluster share a fine-grained topic. We then obtain a time series for each micro-cluster by counting the number of tweets posted in each time window. In the second stage, we use time series clustering to identify the clusters corresponding to coarse-grained topics. We evaluate the computational efficacy of the proposed method and demonstrate that its scalability improves systematically as the data volume increases. Next, we apply the proposed method to large-scale Twitter data (26 million tweets) about COVID-19 vaccination in Japan. The proposed method separately identifies reactions to news and reactions to tweets.
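The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the graph clustering or time series clustering algorithms, so connected components of a thresholded co-occurrence graph and a greedy Pearson-correlation grouping are used here purely as stand-ins. All function names, parameters (`min_count`, `window`, `threshold`), and the toy tweets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt

def cooccurrence_graph(tweets, min_count=2):
    """Link two words if they co-occur in at least min_count tweets."""
    counts = Counter()
    for _, words in tweets:
        counts.update(combinations(sorted(set(words)), 2))
    graph = defaultdict(set)
    for (a, b), c in counts.items():
        if c >= min_count:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def connected_components(graph):
    """Stage 1 stand-in (assumption): one micro-cluster per component."""
    label, next_id = {}, 0
    for start in sorted(graph):
        if start in label:
            continue
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in label:
                label[node] = next_id
                stack.extend(n for n in graph[node] if n not in label)
        next_id += 1
    return label

def micro_cluster_series(tweets, label, window, n_windows):
    """Count tweets per time window for each micro-cluster."""
    series = defaultdict(lambda: [0] * n_windows)
    for t, words in tweets:
        votes = Counter(label[w] for w in words if w in label)
        if votes:  # assign the tweet to its most frequent micro-cluster
            series[votes.most_common(1)[0][0]][int(t // window)] += 1
    return dict(series)

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cluster_series(series, threshold=0.8):
    """Stage 2 stand-in (assumption): greedy grouping by correlation."""
    clusters = []
    for cid in sorted(series):
        for group in clusters:
            if pearson(series[cid], series[group[0]]) >= threshold:
                group.append(cid)
                break
        else:
            clusters.append([cid])
    return clusters

# Toy data (invented for illustration): (timestamp, tokenized tweet).
tweets = [
    (1, ["vaccine", "appointment"]), (2, ["vaccine", "appointment"]),
    (11, ["vaccine", "appointment"]),
    (3, ["side", "effect"]), (4, ["side", "effect"]), (12, ["side", "effect"]),
    (21, ["news", "approval"]), (22, ["news", "approval"]),
    (23, ["news", "approval"]),
]
label = connected_components(cooccurrence_graph(tweets))
series = micro_cluster_series(tweets, label, window=10, n_windows=3)
clusters = cluster_series(series)
print(series)
print(clusters)
```

In this toy run, the two micro-clusters whose tweet counts rise and fall together are merged into one coarse-grained cluster, while the micro-cluster with a different temporal profile stays separate. A real pipeline at the scale mentioned in the abstract would need a scalable graph clustering algorithm and a more robust time series distance than the stand-ins used here.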