Academic Article

Machine Learning as a Tool for Early Detection: A Focus on Late-Stage Colorectal Cancer across Socioeconomic Spectrums.
Document Type
Article
Source
Cancers. Feb2024, Vol. 16 Issue 3, p540. 21p.
Subject
*SOCIAL determinants of health
*MACHINE learning
*EARLY detection of cancer
*ACQUISITION of data
*ARTIFICIAL intelligence
*COLORECTAL cancer
*TUMOR classification
*CONCEPTUAL structures
*T-test (Statistics)
*COMPARATIVE studies
*SOCIOECONOMIC disparities in health
*DESCRIPTIVE statistics
*CHI-squared test
*PREDICTION models
*LITERATURE reviews
*CANCER patient medical care
*EPIDEMIOLOGICAL research
Language
ISSN
2072-6694
Abstract
Simple Summary: This research explores the potential of machine learning (ML) to predict late-stage colorectal cancer (CRC) diagnoses. The focus is on understanding how socioeconomic and regional factors affect cancer care, particularly in detecting CRC at an advanced stage. We aim to merge data on social determinants of health with individual demographics to uncover patterns indicating higher CRC risk. We compared various ML models, such as decision trees, random forest, and gradient boosting to find the most effective tool for this task. The goal is to utilize artificial intelligence (AI) for early, more accurate CRC detection, which can lead to better treatment outcomes. This study promises to significantly contribute to cancer research, potentially leading to more personalized and efficient healthcare strategies that could ultimately save lives.
Purpose: To assess the efficacy of various machine learning (ML) algorithms in predicting late-stage colorectal cancer (CRC) diagnoses against the backdrop of socio-economic and regional healthcare disparities.
Methods: An innovative theoretical framework was developed to integrate individual- and census tract-level social determinants of health (SDOH) with sociodemographic factors. A comparative analysis of the ML models was conducted using key performance metrics such as AUC-ROC to evaluate their predictive accuracy. Spatio-temporal analysis was used to identify disparities in late-stage CRC diagnosis probabilities.
Results: Gradient boosting emerged as the superior model, with the top predictors for late-stage CRC diagnosis being anatomic site, year of diagnosis, age, proximity to superfund sites, and primary payer. Spatio-temporal clusters highlighted geographic areas with a statistically significant high probability of late-stage diagnoses, emphasizing the need for targeted healthcare interventions.
Conclusions: This research underlines the potential of ML in enhancing the prognostic predictions in oncology, particularly in CRC. The gradient boosting model, with its robust performance, holds promise for deployment in healthcare systems to aid early detection and formulate localized cancer prevention strategies. The study's methodology demonstrates a significant step toward utilizing AI in public health to mitigate disparities and improve cancer care outcomes. [ABSTRACT FROM AUTHOR]
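Illustrative note: the Methods above name AUC-ROC as a key metric for comparing models. The sketch below is not the authors' code or data. It shows one common way to compute AUC-ROC (the probability that a randomly chosen positive case is scored above a randomly chosen negative case) and to rank candidate models by it. The function name `auc_roc`, the model names, the labels, and the scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC-ROC: share of (positive, negative) pairs in which the
    positive case gets the higher score; tied scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (assumption): 1 = late-stage diagnosis, 0 = early-stage.
labels = [1, 0, 1, 0, 1, 0]

# Hypothetical predicted probabilities from two unnamed candidate models.
model_scores = {
    "model_a": [0.9, 0.2, 0.8, 0.4, 0.7, 0.1],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.8, 0.3],
}

# Score each model and pick the one with the highest AUC-ROC.
aucs = {name: auc_roc(labels, s) for name, s in model_scores.items()}
best_model = max(aucs, key=aucs.get)
```

The pairwise form is quadratic in the number of cases, so it suits small examples only; a rank-based or library implementation would be the usual choice at registry scale.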