Academic Paper

TweetDrought: A Deep-Learning Drought Impacts Recognizer based on Twitter Data
Document Type
Working Paper
Source
Subject
Computer Science - Computation and Language
Computer Science - Machine Learning
Language
Abstract
Acquiring a better understanding of drought impacts is becoming increasingly vital under a warming climate. Traditional drought indices mainly describe biophysical variables rather than impacts on social, economic, and environmental systems. We used natural language processing and transfer learning based on Bidirectional Encoder Representations from Transformers (BERT) to fine-tune a model on data from the news-based Drought Impact Report (DIR), and then applied it to recognize seven types of drought impacts in filtered Twitter data from the United States. Our model achieved a satisfactory macro-F1 score of 0.89 on the DIR test set. The model was then applied to California tweets and validated against keyword-based labels, yielding a macro-F1 score of 0.58. However, because of the limitations of keywords, we also spot-checked tweets with controversial labels. Of these, 83.5% of the BERT labels were correct compared to the keyword labels. Overall, the fine-tuned BERT-based recognizer provided proper predictions and valuable information on drought impacts. The interpretation and analysis of the model were consistent with experiential domain expertise.
Comment: 5 pages (+3 in appendix), 5 figures in appendix, 2 tables (+1 in appendix), ICML Workshop on Tackling Climate Change with Machine Learning, 2021
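
The macro-F1 scores quoted in the abstract (0.89 and 0.58) are, by the usual definition, the unweighted mean of the per-class F1 scores, so every impact type counts equally regardless of how often it occurs. The following is a minimal sketch of that calculation, not the authors' evaluation code. It assumes each tweet carries one binary indicator per impact type; the function names and the three-class toy data are hypothetical, and scoring a class with no true or predicted positives as 0.0 is an assumed convention.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for a single class, given binary indicators (1 = class present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Assumed convention: a class with no positives at all scores 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-class F1 over all classes (columns)."""
    n_classes = len(y_true[0])
    scores = [
        f1_for_class([row[k] for row in y_true], [row[k] for row in y_pred])
        for k in range(n_classes)
    ]
    return sum(scores) / n_classes


# Hypothetical toy data: 4 tweets, 3 impact types (the paper uses seven).
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

score = macro_f1(y_true, y_pred)
print(f"macro-F1 = {score:.4f}")
```

Because the mean is unweighted, a rare impact type that the model handles poorly lowers the score as much as a common one would.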