Academic Paper

Combat COVID-19 at National Level using Risk Stratification with Appropriate Intervention
Document Type
Conference
Source
2023 IEEE International Conference on Big Data (BigData), pp. 4922-4930, Dec. 2023
Subject
Bioengineering
Computing and Processing
Geoscience
Robotics and Control Systems
Signal Processing and Analysis
COVID-19
Hospitals
Sociology
Machine learning
Big Data
Predictive models
Sensitivity and specificity
predictive analytics
public health
decision-support
risk stratification
COVID-19 pandemic
Language
Abstract
In the national battle against COVID-19, harnessing population-level big data is imperative: it enables authorities to devise effective care policies, allocate healthcare resources efficiently, and enact targeted interventions. Singapore adopted the Home Recovery Programme (HRP) in September 2021, diverting low-risk COVID-19 patients to home care to ease the burden on hospitals, given high vaccination rates and predominantly mild symptoms. While a patient’s suitability for HRP could be assessed using broad-based criteria, integrating a machine learning (ML) model is invaluable for identifying high-risk patients prone to severe illness and for facilitating early medical assessment. Most prior studies have depended on clinical and laboratory data, which require an initial clinic or hospital evaluation. None of these studies incorporated vaccination status, a crucial variable in a well-vaccinated population. This paper proposes a machine learning approach to nationwide risk stratification that offers intervention recommendations by harnessing nationwide datasets. Our best-performing ML model, XGBoost, achieves an AUROC of 0.930 using data from multiple sources, including patients’ demographic information, vaccination status and medical history. For broader applicability, we also propose a parsimonious XGBoost model that achieves an AUROC of 0.885 with five commonly collected variables: age, number of vaccine doses taken, and number of days since the first, second and booster doses. Importantly, both proposed models achieve robust predictive performance without requiring the collection of clinical or laboratory data from patients. We believe that the parsimonious model, which relies on easily obtainable data, has the potential for broader adoption across diverse nations, to the benefit of their populations.
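The abstract reports model quality as AUROC (0.930 and 0.885). As a reading aid, here is a minimal Python sketch of how that metric can be computed from risk scores. It is not the authors' code: the function name `auroc`, the label coding (1 = severe illness, 0 = mild course) and the toy data are assumptions made for illustration only.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data only, NOT taken from the paper.
# Assumed coding: 1 = severe illness, 0 = mild course.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.90]
toy_auroc = auroc(toy_labels, toy_scores)
```

An AUROC of 1.0 means every high-risk patient is scored above every low-risk patient, while 0.5 is no better than chance. The pairwise loop is quadratic in the number of patients, so a nationwide dataset would call for a rank-based implementation instead.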