
Toward Foundational Deep Learning Models for Medical Imaging in the New Era of Transformer Networks.
Document Type
Academic Journal
Author
Willemink MJ; Roth HR; Sandfort V
Affiliations: Department of Radiology, Center for Academic Medicine, Stanford University School of Medicine, 453 Quarry Rd, 324-7, Palo Alto, CA 94304 (M.J.W., V.S.); Segmed, Menlo Park, Calif (M.J.W.); and NVIDIA, Santa Clara, Calif (H.R.R.).
Source
Publisher: Radiological Society of North America, Inc
Country of Publication: United States
NLM ID: 101746556
Publication Model: eCollection
Cited Medium: Internet
ISSN: 2638-6100 (Electronic)
Linking ISSN: 26386100
NLM ISO Abbreviation: Radiol Artif Intell
Subsets: PubMed not MEDLINE
Language
English
Abstract
Deep learning models are currently the cornerstone of artificial intelligence in medical imaging. While progress is still being made, the generic technological core of convolutional neural networks (CNNs) has seen only modest innovation over the past several years, leaving room for improvement. More recently, transformer networks have emerged that replace convolutions with a self-attention mechanism, and they have already matched or exceeded the performance of CNNs on many tasks. Transformers require very large amounts of training data, even more than CNNs, but obtaining well-curated labeled data is expensive and difficult. A possible solution is transfer learning: pretraining on a self-supervised task using very large amounts of unlabeled medical data, after which the network can be fine-tuned on specific medical imaging tasks with relatively modest data requirements. The authors believe that the availability of a large-scale, three-dimension-capable, and extensively pretrained transformer model would be highly beneficial to the medical imaging and research community. In this article, the authors discuss the challenges and obstacles of training a very large medical imaging transformer, including data needs, biases, training tasks, network architecture, privacy concerns, and computational requirements. The obstacles are substantial but not insurmountable for resourceful collaborative teams that may include academia and information technology industry partners. © RSNA, 2022
Keywords: Computer-aided Diagnosis (CAD), Informatics, Transfer Learning, Convolutional Neural Network (CNN)
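To make the pipeline sketched in the abstract concrete, the following minimal PyTorch sketch illustrates its two ideas: self-attention in place of convolution, and self-supervised pretraining on unlabeled images followed by fine-tuning of a small task head. It is illustrative only and does not come from the article; the class name (TinyViTEncoder), the rotation-prediction pretext task, and all layer sizes are assumptions chosen for brevity.

import torch
import torch.nn as nn

class TinyViTEncoder(nn.Module):
    """Patch embedding plus one pre-norm self-attention block (ViT-style)."""
    def __init__(self, img_size=64, patch=8, dim=128, heads=4):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Patchify with a strided conv -- the only convolution left in a ViT.
        self.to_patches = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(),
                                 nn.Linear(dim * 4, dim))

    def forward(self, x):                       # x: (B, 1, H, W) grayscale image
        t = self.to_patches(x).flatten(2).transpose(1, 2) + self.pos
        q = self.norm1(t)
        t = t + self.attn(q, q, q)[0]           # self-attention replaces convolution
        t = t + self.mlp(self.norm2(t))
        return t.mean(dim=1)                    # pooled representation, (B, dim)

encoder = TinyViTEncoder()

# Stage 1 (sketch): a toy self-supervised pretext task on unlabeled images --
# predict which of four rotations was applied. Real recipes (masked-image
# modeling, contrastive learning) are far richer; this shows the loop shape.
rot_head = nn.Linear(128, 4)
pretrain_opt = torch.optim.AdamW(
    list(encoder.parameters()) + list(rot_head.parameters()), lr=1e-4)
unlabeled = torch.randn(16, 1, 64, 64)          # stand-in for unlabeled scans
k = torch.randint(0, 4, (16,))
rotated = torch.stack([torch.rot90(img, int(r), dims=(1, 2))
                       for img, r in zip(unlabeled, k)])
loss = nn.functional.cross_entropy(rot_head(encoder(rotated)), k)
loss.backward()
pretrain_opt.step()

# Stage 2: fine-tune a small task head with the pretrained encoder frozen,
# mirroring the "relatively modest data requirements" of the downstream task.
for p in encoder.parameters():
    p.requires_grad = False
task_head = nn.Linear(128, 2)                   # e.g., a binary diagnostic label
finetune_opt = torch.optim.AdamW(task_head.parameters(), lr=1e-3)
images = torch.randn(8, 1, 64, 64)              # stand-in for a small labeled set
labels = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(task_head(encoder(images)), labels)
loss.backward()
finetune_opt.step()

A production-scale version of what the authors envision would differ in every dimension: three-dimensional patch embeddings for volumetric scans, many stacked attention blocks, and self-supervised pretraining corpora orders of magnitude larger than any single labeled dataset.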
Competing Interests: Disclosures of conflicts of interest: M.J.W. Grant or contract from the American Heart Association (no. 18POST34030192); consulting fees from Segmed; support from Segmed for attending meetings and/or travel; stock or stock options in Segmed. H.R.R. Employed by NVIDIA. V.S. No relevant relationships.
(© 2022 by the Radiological Society of North America, Inc.)