Academic Paper

Predicting COVID-19 Severity Using a Cut-and-Solve Feature Selection Approach
Document Type
Conference
Source
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 3370-3375, Dec. 2023
Subject
Bioengineering
Computing and Processing
Engineering Profession
Robotics and Control Systems
Signal Processing and Analysis
Proteins
COVID-19
Analytical models
Logistic regression
Biological system modeling
Proteomics
Predictive models
Feature Selection
Mixed Integer Programming
Language
ISSN
2156-1133
Abstract
Individuals infected with coronavirus disease 2019 (COVID-19) present in a variety of ways, ranging from asymptomatic infection or mild cough to organ failure or death. One of the major challenges for the medical community is determining quickly and accurately how COVID-19 will progress in an individual. Herein, we introduce a new Cut-and-Solve based feature selection program for identifying predictive feature sets in heterogeneous data. We analyze proteomics data from Washington University to identify models ranging in size from one to five features. Logistic regression models were validated using area under the curve (AUC) on both a holdout data set and an independent data set from Massachusetts General Hospital. A variety of known and novel biomarkers for COVID-19 severity were identified. The best model for predicting severe (ventilation or death) vs. non-severe infection uses CALCOCO2 and STC1, with an average AUC = 0.81. Based on the known severity markers, several different proteomic pathways are identified. Enrichment analysis indicates activity associated with inflammatory response, as well as myelination and cardiac function.
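
The abstract reports model quality as AUC on held-out data. As a minimal sketch of what that metric measures, the snippet below computes AUC as the fraction of (severe, non-severe) patient pairs that a model ranks correctly. This is not the authors' code and it does not implement Cut-and-Solve feature selection; the function name `auc` and the labels and scores are made up for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 = severe, 0 = non-severe.
    scores: model output, where higher means more likely severe.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    # Count severe/non-severe pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical holdout labels and predicted probabilities (not study data).
holdout_labels = [1, 1, 1, 0, 0, 0, 0]
holdout_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(auc(holdout_labels, holdout_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a rough scale for reading the reported average of 0.81.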