Academic Paper

Avoiding the Drunkard's search: Investigating collection strategies for building a Twitter dataset
Document Type
Conference
Source
2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL), pp. 205-206, Jun. 2016
Subject
Computing and Processing
Twitter
Tagging
Time-frequency analysis
Data mining
Media
Data collection
Buildings
Data Analytics
Social Media Analysis
Data Selection
Language
Abstract
We investigate methods for collecting data to form an archive of the Twitter debate surrounding the UK's inclusion in the EU. We use three strategies: gathering data using hashtags, extracting data from the random stream, and collecting from users known to be discussing the debate. We explore the various biases in the resulting datasets.