Academic Paper

Improving Generalizability to Out-of-Distribution Data in Radiogenomic Models to Predict IDH Mutation Status in Glioma Patients
Document Type
Conference
Source
2022 IEEE International Symposium on Medical Measurements and Applications (MeMeA), pp. 1-5, Jun. 2022
Subject
Engineering Profession
Biological system modeling
Magnetic resonance imaging
Genomics
Predictive models
Feature extraction
Data models
Robustness
radiomics
radiogenomics
feature selection
generalizability
robustness
Language
Abstract
Radiogenomics offers a potential virtual, non-invasive biopsy, which is very promising in cases where genomic testing is not available or possible. However, radiogenomic models often lack generalizability: performance degrades on unseen data, and differences in MRI sequence parameters, MRI manufacturers, and scanners make this issue worse. Selecting the radiomic features to include in the model is therefore of paramount importance, as proper feature selection may make the models more robust and generalizable to unseen data. This study developed and assessed a novel unsupervised, yet biologically based, feature selection method capable of improving the performance of radiogenomic models on unseen data. We assessed 63 patients with low-grade glioma or glioblastoma multiforme, imaged at 4 different institutions/centers and publicly available in The Cancer Genome Atlas (TCGA) and The Cancer Imaging Archive (TCIA). Radiomic features were extracted from multiparametric MRI images (pre-contrast T1-weighted [T1w], post-contrast T1-weighted [cT1w], T2-weighted [T2w], and FLAIR) and from different regions of interest (enhancing tumor, non-enhancing tumor/necrosis, and edema). The proposed method was compared with an embedded feature selection approach commonly used in radiomics/radiogenomics studies by holding out the data from one center as an independent test set and tuning the model with the data from the remaining centers. The performance of the proposed method was consistently better on all test sets, showing that it improves robustness and generalizability to out-of-distribution data.
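The held-out-center evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `leave_one_center_out` and the center labels "A" to "D" are hypothetical, and the example only shows how patient indices could be split so that each center in turn serves as the unseen test set while the remaining centers are used for tuning.

```python
from collections import defaultdict


def leave_one_center_out(centers):
    """Yield (held_out_center, train_idx, test_idx) once per distinct center.

    `centers` holds one center label per patient, in patient order.
    """
    # Group patient indices by the center that acquired their images.
    by_center = defaultdict(list)
    for idx, center in enumerate(centers):
        by_center[center].append(idx)

    # Each center in turn becomes the independent held-out test set.
    for held_out in sorted(by_center):
        test_idx = by_center[held_out]
        train_idx = sorted(
            i
            for center, idxs in by_center.items()
            if center != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical center labels for eight patients (not the study's data).
centers = ["A", "A", "B", "C", "B", "D", "C", "D"]
splits = list(leave_one_center_out(centers))

for held_out, train_idx, test_idx in splits:
    print(f"test center {held_out}: train={train_idx} test={test_idx}")
```

In practice, feature selection and model tuning would be fitted on the training indices only, so that no information from the held-out center leaks into the model. Libraries such as scikit-learn provide an equivalent splitter (`LeaveOneGroupOut`), which may be preferable to a hand-written one.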