Academic Paper
An automated machine learning framework to optimize radiomics model construction validated on twelve clinical applications
Document Type
Working Paper
Author
Starmans, Martijn P. A.; van der Voort, Sebastian R.; Phil, Thomas; Timbergen, Milea J. M.; Vos, Melissa; Padmos, Guillaume A.; Kessels, Wouter; Hanff, David; Grunhagen, Dirk J.; Verhoef, Cornelis; Sleijfer, Stefan; Bent, Martin J. van den; Smits, Marion; Dwarkasing, Roy S.; Els, Christopher J.; Fiduzi, Federico; van Leenders, Geert J. L. H.; Blazevic, Anela; Hofland, Johannes; Brabander, Tessa; van Gils, Renza A. H.; Franssen, Gaston J. H.; Feelders, Richard A.; de Herder, Wouter W.; Buisman, Florian E.; Willemssen, Francois E. J. A.; Koerkamp, Bas Groot; Angus, Lindsay; van der Veldt, Astrid A. M.; Rajicic, Ana; Odink, Arlette E.; Deen, Mitchell; T., Jose M. Castillo; Veenland, Jifke; Schoots, Ivo; Renckens, Michel; Doukas, Michail; de Man, Rob A.; IJzermans, Jan N. M.; Miclea, Razvan L.; Vermeulen, Peter B.; Bron, Esther E.; Thomeer, Maarten G.; Visser, Jacob J.; Niessen, Wiro J.; Klein, Stefan
Source
Subject
Language
Abstract
Predicting clinical outcomes from medical images using quantitative features ("radiomics") requires many method design choices. Currently, in new clinical applications, finding the optimal radiomics method out of the wide range of available methods relies on a manual, heuristic trial-and-error process. We introduce a novel automated framework that optimizes radiomics workflow construction per application by standardizing the radiomics workflow in modular components, including a large collection of algorithms for each component, and formulating a combined algorithm selection and hyperparameter optimization problem. To solve it, we employ automated machine learning through two strategies (random search and Bayesian optimization) and three ensembling approaches. Results show that a medium-sized random search and straightforward ensembling perform similarly to more advanced methods while being more efficient. Validated across twelve clinical applications, our approach outperforms both a radiomics baseline and human experts. In conclusion, our framework improves and streamlines radiomics research by fully automatically optimizing radiomics workflow construction. To facilitate reproducibility, we publicly release six datasets, software of the method, and code to reproduce this study.
Comment: 22 pages, 3 figures, 2 tables, 1 algorithm, 3 supplementary figures, 4 supplementary tables, 1 supplementary algorithm
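As a rough illustration of the combined algorithm selection and hyperparameter optimization problem named in the abstract, the sketch below runs a small random search over two toy classifiers and averages the top-ranked configurations into an ensemble. Everything in it is an assumption made for illustration: the synthetic data, the two "algorithms" (`threshold` and `centroid`), the hyperparameter ranges, and all function names are hypothetical and are not taken from the framework's released software.

```python
import random
from statistics import mean

def make_data(n, rng):
    """Toy two-class data (assumed): class 1 is shifted by +1 on both features."""
    data = []
    for i in range(n):
        y = i % 2  # alternate labels so both classes are always present
        data.append(((rng.gauss(y, 1.0), rng.gauss(y, 1.0)), y))
    return data

def sample_config(rng):
    """Draw one configuration: first an algorithm, then its own hyperparameters."""
    algo = rng.choice(["threshold", "centroid"])
    if algo == "threshold":
        return {"algo": algo, "feature": rng.choice([0, 1]),
                "cut": rng.uniform(-1.0, 2.0)}
    return {"algo": algo, "w": rng.uniform(0.0, 1.0)}

def fit(config, train):
    """Return a model: a function mapping a sample x to a score of 0.0 or 1.0."""
    if config["algo"] == "threshold":
        f, cut = config["feature"], config["cut"]
        return lambda x: 1.0 if x[f] > cut else 0.0
    # Nearest class centroid with a weighted squared distance.
    w = config["w"]
    centers = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        centers[label] = (mean(p[0] for p in pts), mean(p[1] for p in pts))
    def dist(x, c):
        return w * (x[0] - c[0]) ** 2 + (1.0 - w) * (x[1] - c[1]) ** 2
    return lambda x: 1.0 if dist(x, centers[1]) < dist(x, centers[0]) else 0.0

def accuracy(model, data):
    """Fraction of samples whose thresholded score matches the label."""
    return mean(1.0 if (model(x) >= 0.5) == (y == 1) else 0.0 for x, y in data)

def random_search(train, val, n_iter, rng):
    """Evaluate n_iter random configurations; return them ranked by validation accuracy."""
    results = []
    for _ in range(n_iter):
        config = sample_config(rng)
        model = fit(config, train)
        results.append((accuracy(model, val), config, model))
    results.sort(key=lambda r: r[0], reverse=True)
    return results

def top_n_ensemble(results, n):
    """Average the scores of the n best-ranked models."""
    models = [model for _, _, model in results[:n]]
    return lambda x: mean(m(x) for m in models)

rng = random.Random(0)
train, val, test = make_data(200, rng), make_data(100, rng), make_data(100, rng)
results = random_search(train, val, n_iter=50, rng=rng)
ensemble = top_n_ensemble(results, n=5)
print("best single config:", results[0][1])
print("ensemble test accuracy:", accuracy(ensemble, test))
```

The search treats the algorithm choice as just another sampled parameter, which is what makes the problem a single joint optimization rather than a per-algorithm tuning loop. Averaging the top-ranked models is only one simple ensembling option; the abstract mentions three approaches without detailing them here.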