Academic Paper

MedCat: A Framework for High Level Conceptualization of Medical Notes
Document Type
Conference
Source
2013 IEEE 13th International Conference on Data Mining Workshops (ICDMW), pp. 274-280, Dec. 2013
Subject
Engineering Profession
General Topics for Engineers
Manuals
Feature extraction
Unified modeling language
Noise
Natural language processing
Sensitivity
Data mining
MedCat
PTSD
categorization
Natural Language Processing
Language
ISSN
2375-9232
2375-9259
Abstract
In this paper we introduce MedCat, a framework for projecting representations of concept-derived content in clinical notes into a new categorization space in order to reduce dimensionality and noise in the data. Constructing the MedCat framework required several steps: manual annotation, knowledge base expansion using MetaMap, concept category construction, automated annotation using NLP to generate a bag of concepts, and finally conversion of concepts to higher-level abstracted categories. The framework was evaluated on Post-Traumatic Stress Disorder (PTSD) clinical notes. A random sample of PTSD clinical note content was automatically recategorized into six PTSD treatment categories using MedCat. Against a reference standard of existing annotations from PTSD notes that content experts had categorized into treatment categories, the sensitivity of the framework in detecting the treatment categories was greater than 90%. The results suggest that representations of concept-derived content, when categorized by relevance features, can be used to reliably understand and summarize clinical notes.
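
The last two steps named in the abstract (a bag of concepts converted to higher-level categories) and the sensitivity measure used for evaluation can be illustrated with a minimal sketch. This is not the authors' implementation: the concept phrases, the category names (`category_a`, `category_b`, `category_c`), the naive substring matching, and all function names are assumptions made for illustration. The abstract does not list the six PTSD treatment categories or the concepts assigned to them, and the actual framework uses MetaMap and NLP-based annotation rather than string matching.

```python
from collections import Counter

# Hypothetical concept-to-category map. The real categories and concept
# lists are not given in the abstract; these are placeholders.
CONCEPT_TO_CATEGORY = {
    "prolonged exposure": "category_a",
    "imaginal exposure": "category_a",
    "cognitive processing therapy": "category_b",
    "sertraline": "category_c",
}


def bag_of_concepts(note, lexicon):
    """Count occurrences of each known concept phrase in a note.

    Uses naive lowercase substring matching as a stand-in for the
    automated annotation step.
    """
    text = note.lower()
    counts = Counter()
    for phrase in lexicon:
        n = text.count(phrase)
        if n:
            counts[phrase] = n
    return counts


def to_categories(concepts, mapping):
    """Collapse concept counts into counts per higher-level category."""
    categories = Counter()
    for concept, n in concepts.items():
        categories[mapping[concept]] += n
    return categories


def sensitivity(predicted, reference):
    """Return TP / (TP + FN) over per-note sets of category labels."""
    tp = fn = 0
    for pred, ref in zip(predicted, reference):
        tp += len(ref & pred)   # reference categories that were detected
        fn += len(ref - pred)   # reference categories that were missed
    return tp / (tp + fn) if (tp + fn) else float("nan")


notes = [
    "Patient began prolonged exposure; imaginal exposure homework reviewed.",
    "Continued sertraline 50 mg.",
]
predicted = [
    set(to_categories(bag_of_concepts(n, CONCEPT_TO_CATEGORY), CONCEPT_TO_CATEGORY))
    for n in notes
]
# Hypothetical expert labels serving as the reference standard.
reference = [{"category_a"}, {"category_b", "category_c"}]
score = sensitivity(predicted, reference)
```

In this toy example, four concept phrases collapse into three categories, which is the dimensionality reduction the abstract describes on a small scale. Sensitivity counts only reference categories that were found or missed, so extra predicted categories do not lower it.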