Academic Article

Exploring the impact of missingness on racial disparities in predictive performance of a machine learning model for emergency department triage
Research and Applications
Document Type
Academic Journal
Source
JAMIA Open. December 2023, Vol. 6 Issue 4
Subject
Comparative analysis
Electronic records -- Comparative analysis
Medical records -- Comparative analysis
Hospital emergency services -- Comparative analysis
Machine learning -- Comparative analysis
Medical research -- Comparative analysis
Medicine, Experimental -- Comparative analysis
Hospitals -- Emergency service
Language
English
ISSN
2574-2531
Abstract
Background and significance Racism, a broad social system that assigns and ranks people in socially/politically invented racial groups and underpins their differential treatment, (1) may influence clinical decision-making technology in [...]
Objective: To investigate how missing data in the patient problem list may impact racial disparities in the predictive performance of a machine learning (ML) model for emergency department (ED) triage.

Materials and Methods: Racial disparities may exist in the missingness of EHR data (eg, systematic differences in access, testing, and/or treatment) that can impact model predictions across racialized patient groups. We use an ML model that predicts patients' risk for adverse events to produce triage-level recommendations, patterned after a clinical decision support tool deployed at multiple EDs. We compared the model's predictive performance on sets of observed (problem list data at the point of triage) versus manipulated (updated to the more complete problem list at the end of the encounter) test data. These differences were compared between Black and non-Hispanic White patient groups using multiple performance measures relevant to health equity.

Results: There were modest, but significant, changes in predictive performance comparing the observed to manipulated models across both Black and non-Hispanic White patient groups; c-statistic improvement ranged between 0.027 and 0.058. The manipulation produced no between-group differences in c-statistic by race. However, there were small between-group differences in other performance measures, with greater change for non-Hispanic White patients.

Discussion: Problem list missingness impacted model performance for both patient groups, with marginal differences detected by race.

Conclusion: Further exploration is needed to examine how missingness may contribute to racial disparities in clinical model predictions across settings. The novel manipulation method demonstrated may aid future research.

Lay Summary
Machine learning (ML) can be used to leverage existing clinical data--like in the electronic health record (EHR)--to predict future events.
ML algorithms are developed and trained using data collected and stored during prior healthcare encounters. Thus, they are prone to bias that exists within these datasets, including bias that drives more reliable predictions for one racialized group than another. A critical source of potential bias is missing data. EHR data are often incomplete; when more data are missing in more significant ways for one group than another, this can result in less reliable predictions for that group. In this study, we developed and tested a method for measuring the impact of missing data on ML prediction reliability. We used this method to measure effects of missing medical problem information on the accuracy of ML predictions used to guide an emergency department triage decision support tool, and compared these effects across racialized groups. Missing medical problem data had a small effect on prediction accuracy across all racialized groups and, in this setting, impacted predictions for non-Hispanic White patients slightly more than for Black patients. The method we describe here is useful for future studies that interrogate bias from missing data.

Key words: decision support systems, clinical; health equity; triage.
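The comparison the abstract describes can be sketched in a few lines of code: compute the c-statistic (concordance/AUC) for each racialized group on model scores from the observed test data and from the manipulated (more complete problem list) test data, then take the within-group difference. This is a minimal illustration of the evaluation logic only, not the authors' implementation; the function names, the row layout, and the toy scores below are all assumptions.

```python
from typing import Dict, List, Tuple

def c_statistic(labels: List[int], scores: List[float]) -> float:
    """Concordance (c-statistic / AUC): probability that a randomly chosen
    positive case receives a higher score than a randomly chosen negative case,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_change_by_group(
    rows: List[Tuple[str, int, float, float]]
) -> Dict[str, float]:
    """rows: (group, outcome, score_on_observed_data, score_on_manipulated_data).
    Returns, per group, the change in c-statistic after the manipulation."""
    groups: Dict[str, Tuple[List[int], List[float], List[float]]] = {}
    for g, y, s_obs, s_man in rows:
        ys, obs, man = groups.setdefault(g, ([], [], []))
        ys.append(y)
        obs.append(s_obs)
        man.append(s_man)
    return {g: c_statistic(ys, man) - c_statistic(ys, obs)
            for g, (ys, obs, man) in groups.items()}
```

A between-group disparity analysis would then compare these per-group deltas (the paper reports within-group improvements of 0.027 to 0.058 and no between-group c-statistic difference by race).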