Academic Article

Accuracy of ChatGPT generated diagnosis from patient's medical history and imaging findings in neuroradiology cases.

Document Type

Article

Author

Horiuchi, Daisuke; Tatekawa, Hiroyuki; Shimono, Taro; Walston, Shannon L; Takita, Hirotaka; Matsushita, Shu; Oura, Tatsushi; Mitsuyama, Yasuhito; Miki, Yukio; Ueda, Daiju

Source

Neuroradiology. Jan2024, Vol. 66 Issue 1, p73-79. 7p.

Subject

*Neck anatomy
*Artificial intelligence tests
*Brain anatomy
*Central nervous system diseases
*Serial publications
*Fisher exact test
*Head
*Comparative studies
*Medical history taking
*Medical records
*Case studies
*Descriptive statistics
*Sensitivity & specificity (Statistics)
*Neuroradiology
*Spine
Research evaluation
Central nervous system tumors

ISSN

0028-3940

Abstract

Purpose: The noteworthy performance of Chat Generative Pre-trained Transformer (ChatGPT), an artificial intelligence text generation model based on the GPT-4 architecture, has been demonstrated in various fields; however, its potential applications in neuroradiology remain unexplored. This study aimed to evaluate the diagnostic performance of GPT-4 based ChatGPT in neuroradiology.
Methods: We collected 100 consecutive "Case of the Week" cases from the American Journal of Neuroradiology between October 2021 and September 2023. ChatGPT generated a diagnosis from patient's medical history and imaging findings for each case. Then the diagnostic accuracy rate was determined using the published ground truth. Each case was categorized by anatomical location (brain, spine, and head & neck), and brain cases were further divided into central nervous system (CNS) tumor and non-CNS tumor groups. Fisher's exact test was conducted to compare the accuracy rates among the three anatomical locations, as well as between the CNS tumor and non-CNS tumor groups.
Results: ChatGPT achieved a diagnostic accuracy rate of 50% (50/100 cases). There were no significant differences between the accuracy rates of the three anatomical locations (p = 0.89). The accuracy rate was significantly lower for the CNS tumor group compared to the non-CNS tumor group in the brain cases (16% [3/19] vs. 62% [36/58], p < 0.001).
Conclusion: This study demonstrated the diagnostic performance of ChatGPT in neuroradiology. ChatGPT's diagnostic accuracy varied depending on disease etiologies, and its diagnostic accuracy was significantly lower in CNS tumors compared to non-CNS tumors. [ABSTRACT FROM AUTHOR]
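
Note on the reported comparison: the abstract gives enough counts (3/19 correct for CNS tumors, 36/58 correct for non-CNS tumors) to rebuild the 2x2 table behind the brain-case comparison. The following is a minimal sketch of a two-sided Fisher's exact test on that table, not the authors' code. The abstract does not say which software or which two-sided convention was used, so the function name `fisher_exact_two_sided` and the choice to sum all tables no more probable than the observed one are assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of every
    table (with the same margins) that is no more likely than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Counts taken from the abstract (brain cases only).
cns_correct, cns_total = 3, 19
non_cns_correct, non_cns_total = 36, 58

cns_accuracy = cns_correct / cns_total
non_cns_accuracy = non_cns_correct / non_cns_total

p_value = fisher_exact_two_sided(
    cns_correct, cns_total - cns_correct,
    non_cns_correct, non_cns_total - non_cns_correct,
)

print(f"CNS tumor accuracy:     {cns_accuracy:.0%}")
print(f"Non-CNS tumor accuracy: {non_cns_accuracy:.0%}")
print(f"Two-sided Fisher p < 0.001: {p_value < 0.001}")
```

The three-location comparison (p = 0.89) cannot be rebuilt the same way, because the abstract does not report per-location counts for spine and head & neck cases.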
