Academic Article

Accuracy of ChatGPT generated diagnosis from patient's medical history and imaging findings in neuroradiology cases.
Document Type
Article
Source
Neuroradiology. Jan2024, Vol. 66 Issue 1, p73-79. 7p.
Subject
*NECK anatomy
*ARTIFICIAL intelligence tests
*BRAIN anatomy
*CENTRAL nervous system diseases
*SERIAL publications
*FISHER exact test
*HEAD
*COMPARATIVE studies
*MEDICAL history taking
*MEDICAL records
*CASE studies
*DESCRIPTIVE statistics
*SENSITIVITY & specificity (Statistics)
*NEURORADIOLOGY
*SPINE
RESEARCH evaluation
CENTRAL nervous system tumors
Language
ISSN
0028-3940
Abstract
Purpose: The noteworthy performance of Chat Generative Pre-trained Transformer (ChatGPT), an artificial intelligence text generation model based on the GPT-4 architecture, has been demonstrated in various fields; however, its potential applications in neuroradiology remain unexplored. This study aimed to evaluate the diagnostic performance of GPT-4-based ChatGPT in neuroradiology. Methods: We collected 100 consecutive "Case of the Week" cases from the American Journal of Neuroradiology between October 2021 and September 2023. For each case, ChatGPT generated a diagnosis from the patient's medical history and imaging findings. The diagnostic accuracy rate was then determined using the published ground truth. Each case was categorized by anatomical location (brain, spine, and head & neck), and brain cases were further divided into central nervous system (CNS) tumor and non-CNS tumor groups. Fisher's exact test was conducted to compare the accuracy rates among the three anatomical locations, as well as between the CNS tumor and non-CNS tumor groups. Results: ChatGPT achieved a diagnostic accuracy rate of 50% (50/100 cases). There were no significant differences in accuracy rate among the three anatomical locations (p = 0.89). Among the brain cases, the accuracy rate was significantly lower for the CNS tumor group than for the non-CNS tumor group (16% [3/19] vs. 62% [36/58], p < 0.001). Conclusion: This study demonstrated the diagnostic performance of ChatGPT in neuroradiology. ChatGPT's diagnostic accuracy varied depending on disease etiology and was significantly lower in CNS tumors than in non-CNS tumors. [ABSTRACT FROM AUTHOR]
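The CNS tumor versus non-CNS tumor comparison reported in the abstract can be checked from the counts it gives (3/19 and 36/58 correct). The following is a minimal sketch, not the authors' analysis code: the function name `fisher_exact_two_sided` is hypothetical, and it assumes the common two-sided definition that sums the probabilities of all tables no more likely than the observed one. The abstract does not state which two-sided definition or software was used. The three-location comparison (p = 0.89) is not reproduced because the abstract does not report the per-location counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: "two-sided" means summing every table (with the same
    margins) whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def weight(k):
        # Unnormalized hypergeometric weight for k in the top-left cell.
        # Kept as an exact integer so the <= comparison has no rounding issues.
        return comb(row1, k) * comb(row2, col1 - k)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    # The weights over all k sum to comb(total, col1) (Vandermonde's identity).
    return extreme / comb(total, col1)


# Counts taken from the abstract: correct vs. incorrect diagnoses.
cns_correct, cns_total = 3, 19
non_cns_correct, non_cns_total = 36, 58

p_value = fisher_exact_two_sided(
    cns_correct,
    cns_total - cns_correct,
    non_cns_correct,
    non_cns_total - non_cns_correct,
)

print(f"CNS tumor accuracy:     {100 * cns_correct / cns_total:.0f}%")
print(f"Non-CNS tumor accuracy: {100 * non_cns_correct / non_cns_total:.0f}%")
print(f"Two-sided Fisher exact p-value: {p_value:.5f}")
```

Under these assumptions the p-value comes out below 0.001, consistent with the reported result.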