Academic Paper

Dengue Fever: From Extreme Climates to Outbreak Prediction
Document Type
Conference
Source
2022 IEEE International Conference on Data Mining (ICDM), pp. 1083-1088, Nov. 2022
Subject
Computing and Processing
Climate change
Diseases
Epidemics
Ensemble learning
Predictive models
Forecasting
Dengue Fever prediction
Epidemic forecast
Outbreak prediction
Language
ISSN
2374-8486
Abstract
Dengue Fever (DF) is an emerging mosquito-borne infectious disease that affects hundreds of millions of people each year, with considerable morbidity and mortality rates, especially among children. Alongside global climate change, it continues to increase in both the number of cases and the number of affected locations. Effective early warning systems are therefore urgently needed to improve disease control and prevention. In this paper, we introduce a novel framework, called Proximity Time Ensemble (PT-Ensem), to predict DF outbreaks for multiple areas (provinces) and multiple time steps ahead, and to study the effects of climate data on DF outbreaks. PT-Ensem consists of 6 key components: (1) an event-to-event probabilistic framework to study links between extreme climate events and DF outbreaks; (2) a proximity graph that connects similar provinces; (3) an ensemble prediction technique that combines many different advanced machine learning (ML) methods to predict outbreaks within t time steps in the future, using extreme climate events as model inputs; (4) a data aggregation scheme that enriches the training data for each province with data from its neighbors in the proximity graph; (5) a proximity propagation step that propagates predicted results among similar provinces via the proximity graph until maximal agreement is reached among provinces; and (6) a time propagation step that propagates results across the different predicted time steps in each province. We use PT-Ensem to predict DF outbreaks for all provinces in Vietnam using data collected from 1997 to 2016. Experiments show that PT-Ensem achieves a significant performance boost over many highly rated ML models such as XGBoost, LightGBM, and CatBoost in the outbreak prediction task. Compared with recent deep learning approaches for predicting DF incidence, such as LSTM-ATT, LSTM, CNN, and Transformer, PT-Ensem also performs better in both prediction accuracy and computation time.
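To make component (5) more concrete, the following is a minimal sketch of one way a proximity propagation step could work. The abstract does not give the actual update rule, so everything here is an assumption for illustration: the function name `propagate`, the damping weight `alpha`, the stopping tolerance `tol`, and the neighbor-averaging rule (borrowed from generic label-propagation methods) are hypothetical and are not taken from the paper. The province names and scores are made up.

```python
def propagate(scores, neighbors, alpha=0.5, tol=1e-6, max_iter=100):
    """Blend each province's outbreak score with its neighbors' scores.

    scores:    dict province -> initial outbreak probability (e.g. from an ensemble)
    neighbors: dict province -> list of similar provinces (the proximity graph)
    alpha:     weight kept on the province's own initial score (assumed value)
    tol:       stop once no score moves by more than this between rounds
    """
    current = dict(scores)
    for _ in range(max_iter):
        updated = {}
        for province, own in scores.items():
            nbrs = neighbors.get(province, [])
            if nbrs:
                nbr_mean = sum(current[q] for q in nbrs) / len(nbrs)
                # Assumed rule: convex mix of own initial score and neighbor mean.
                updated[province] = alpha * own + (1 - alpha) * nbr_mean
            else:
                # Isolated provinces keep their initial score.
                updated[province] = own
        delta = max(abs(updated[p] - current[p]) for p in scores)
        current = updated
        if delta < tol:
            break
    return current


# Hypothetical toy graph: A and B are similar, C has no neighbors.
initial = {"A": 0.9, "B": 0.1, "C": 0.5}
graph = {"A": ["B"], "B": ["A"], "C": []}
smoothed = propagate(initial, graph)
```

In this toy example the scores of the two linked provinces are pulled toward each other while the isolated province is left unchanged. With `alpha` strictly between 0 and 1 the update is a contraction, so the loop settles on a fixed point, which is one simple reading of "until maximal agreement is reached".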