Academic article

A richly interactive exploratory data analysis and visualization tool using electronic medical records
Document Type
article
Source
BMC Medical Informatics and Decision Making. 15(1)
Subject
Networking and Information Technology R&D (NITRD)
Clinical Research
Bioengineering
Generic health relevance
Clinical Studies as Topic
Data Interpretation, Statistical
Electronic Health Records
Humans
Information Storage and Retrieval
National Health Programs
Pilot Projects
Taiwan
Information Systems
Clinical Sciences
Medical Informatics
Language
Abstract
Background: Electronic medical records (EMRs) contain vast amounts of data that are of great interest to physicians, clinical researchers, and medical policy makers. As the size, complexity, and accessibility of EMRs grow, extracting meaningful information from them has become an increasingly important problem.

Methods: We developed a standardized data analysis process to support cohort studies focused on a particular disease. We use an interactive divide-and-conquer approach to classify patients into groups that are each relatively uniform. It is an iterative process that enables the user to divide the data into homogeneous subsets that can be visually examined, compared, and refined. The final visualization is driven by the transformed data, and user feedback is directed to the corresponding operators, which completes the iterative loop. The results are shown in a Sankey diagram-style timeline, a kind of flow diagram that shows factors' states and transitions over time.

Results: This paper presents a visually rich, interactive web-based application that enables researchers to study any cohort over time using EMR data. The resulting visualizations help uncover hidden information in the data, compare differences between patient groups, determine critical factors that influence a particular disease, and direct further analyses. We introduce and demonstrate this tool using the EMRs of 14,567 chronic kidney disease (CKD) patients.

Conclusions: We developed a visual mining system to support exploratory data analysis of multi-dimensional categorical EMR data. Using CKD as a model disease, the system was assembled from automated correlational analysis and human-curated visual evaluation. Visualization methods such as the Sankey diagram can reveal useful knowledge about a particular disease cohort and the trajectories of the disease over time.
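To make the Sankey-style timeline described in the Methods more concrete, the following is a minimal sketch of how state transitions between consecutive time steps could be counted to produce the links of such a diagram. It is not the authors' implementation: the function name `sankey_links`, the per-patient list layout, and the toy stage labels are all assumptions made for illustration.

```python
from collections import Counter


def sankey_links(trajectories):
    """Count state-to-state transitions between consecutive time steps.

    trajectories: dict mapping a patient id to a list of categorical
    states, one per time step (a hypothetical layout, not the paper's
    data model).
    Returns a Counter keyed by (step, source_state, target_state); each
    count is the width of one link in a Sankey-style timeline.
    """
    links = Counter()
    for states in trajectories.values():
        # Pair each state with its successor: (s0, s1), (s1, s2), ...
        for step, (src, dst) in enumerate(zip(states, states[1:])):
            links[(step, src, dst)] += 1
    return links


# Toy cohort of three patients observed at three time steps (invented data).
trajectories = {
    "p1": ["stage3", "stage3", "stage4"],
    "p2": ["stage3", "stage4", "stage4"],
    "p3": ["stage3", "stage3", "stage3"],
}

links = sankey_links(trajectories)
for (step, src, dst), n in sorted(links.items()):
    print(f"t{step} {src} -> t{step + 1} {dst}: {n}")
```

Each key identifies one flow band between two adjacent time columns, and the count gives its thickness. Restricting `trajectories` to one patient subset before calling the function would give the per-group view that the divide-and-conquer step produces.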