Academic Paper

Text mining using clinical terms in electronic records of annual falls of patients in home community care
Document Type
Conference
Source
2023 IEEE International Conference on Big Data (BigData), pp. 6122-6124, Dec. 2023
Subject
Bioengineering
Computing and Processing
Geoscience
Robotics and Control Systems
Signal Processing and Analysis
Text mining
Training
Aspirin
Frequency-domain analysis
Mouth
Big Data
Bioinformatics
clinical notes
clinical terms
electronic health records
health big data
text mining
Language
Abstract
The volume of health informatics text in electronic health records (EHRs) has increased. In parallel, text mining of health informatics in clinical notes has also increased in an attempt to utilize big data in EHRs for better patient care. However, little is known about how to use the word frequencies of text in clinical notes in EHRs to investigate patient care on an annual basis.
Our methods used a tool called NimbleMiner (NM), which utilizes RStudio, to generate a relatively “naïve” native lexicon of Simclins (i.e., SIMilar CLINical terms), with special emphasis on patient falls, which are important events that can lead to worse health outcomes. The frequencies of the identified Simclins in the file were then determined using a basic Python script for 1-gram words (e.g., “home”) and using Voyant Tools for n+1-gram words (e.g., “living room” or “intensive care unit”).
The most frequent words (on an annual basis) in the training corpus were: “mg” (15,726); “patient” (12,945); “tablet” (10,992); “po”, i.e., per os or by mouth (8,904); and “blood” (8,824). Simclins were identified for each domain of interest, along with the initial prompts for the first search iteration of NM and ChatGPT3.5 output for comparison purposes. Over one year, there were 15,252 Simclin occurrences related to home settings and 2,421 related to clinical settings. Simclin frequencies per year by domain were “fall” (514), “setting” (17,673), computerized provider order entry (CPOE) (3,408), and medications (62,317). Furthermore, a substantially larger share of the text concerned medications (i.e., mg, tablet, medications, aspirin, coumadin, dose, doses, prescribed, and unit_ml_solution): 33,769 of the total of 46,530, or 72.6% of the corpus.
The use of NM and other tools showed that text mining of clinical notes can provide operational information about important clinical events based on Simclin word frequencies in the clinical notes on an annual basis. The results showed the importance of lower-frequency words like “fall” as well as higher-frequency terms such as “medications” and “prescriptions”. More investigation is needed into using Simclins across the health big data of clinical notes in EHRs to better understand the use of text in patient care.
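The 1-gram counting step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' script: the function names `unigram_counts` and `domain_frequencies`, the tokenization rule, the sample note, and the two-domain lexicon are all hypothetical and chosen only to show how per-domain Simclin frequencies and their shares might be tallied.

```python
import re
from collections import Counter


def unigram_counts(text):
    """Lowercase the text and count 1-gram tokens (letters, digits, underscores)."""
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    return Counter(tokens)


def domain_frequencies(counts, lexicon):
    """Sum the counts of every term listed under each domain of the lexicon."""
    return {
        domain: sum(counts[term] for term in terms)
        for domain, terms in lexicon.items()
    }


# Hypothetical toy note and lexicon (assumptions for illustration only).
note = "Patient had a fall at home. Aspirin 81 mg tablet po daily. No fall since."
lexicon = {
    "fall": ["fall", "falls"],
    "medications": ["mg", "tablet", "aspirin"],
}

counts = unigram_counts(note)
by_domain = domain_frequencies(counts, lexicon)

# Share of each domain among all lexicon matches, as a percentage.
total = sum(by_domain.values())
shares = {domain: round(100 * n / total, 1) for domain, n in by_domain.items()}

print(by_domain)
print(shares)
```

The same share arithmetic applied to the figures reported in the abstract, 33,769 of 46,530, gives the stated 72.6%. Multi-word terms such as “living room” would need a different tokenization, which the abstract attributes to Voyant Tools rather than to the Python script.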