Academic article
Medical records-based chronic kidney disease phenotype for clinical care and 'big data' observational and genetic studies
Document Type
article
Author
Ning Shang; Atlas Khan; Fernanda Polubriaginof; Francesca Zanoni; Karla Mehl; David Fasel; Paul E. Drawz; Robert J. Carrol; Joshua C. Denny; Matthew A. Hathcock; Adelaide M. Arruda-Olson; Peggy L. Peissig; Richard A. Dart; Murray H. Brilliant; Eric B. Larson; David S. Carrell; Sarah Pendergrass; Shefali Setia Verma; Marylyn D. Ritchie; Barbara Benoit; Vivian S. Gainer; Elizabeth W. Karlson; Adam S. Gordon; Gail P. Jarvik; Ian B. Stanaway; David R. Crosslin; Sumit Mohan; Iuliana Ionita-Laza; Nicholas P. Tatonetti; Ali G. Gharavi; George Hripcsak; Chunhua Weng; Krzysztof Kiryluk
Source
npj Digital Medicine, Vol 4, Iss 1, Pp 1-13 (2021)
Language
English
ISSN
2398-6352
Abstract
Chronic kidney disease (CKD) is a slowly progressive disorder that is typically silent until its late stages, but early intervention can significantly delay its progression. We designed a portable and scalable electronic CKD phenotype to facilitate early disease recognition and empower large-scale observational and genetic studies of kidney traits. The algorithm uses a combination of rule-based and machine-learning methods to automatically place patients on the staging grid of albuminuria by glomerular filtration rate (“A-by-G” grid). We manually validated the algorithm through 451 chart reviews across three medical systems, demonstrating an overall positive predictive value of 95% for CKD cases and 97% for healthy controls. Independent case-control validation using 2350 patient records demonstrated a diagnostic specificity of 97% and a sensitivity of 87%. Application of the phenotype to 1.3 million patients demonstrated that over 80% of CKD cases go undetected when ICD codes alone are used. We also demonstrated several large-scale applications of the phenotype, including identification of stage-specific kidney disease comorbidities, in silico estimation of kidney trait heritability in thousands of pedigrees reconstructed from medical records, and biobank-based multicenter genome-wide and phenome-wide association studies.
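To illustrate what placing a patient on the "A-by-G" grid involves, the following is a minimal sketch of the rule-based part only. It assumes the standard KDIGO 2012 category cut-offs for estimated GFR (mL/min/1.73 m^2) and urine albumin-to-creatinine ratio (mg/g). It is not the authors' published algorithm, which also includes machine-learning components and is not specified in this abstract. The function names `gfr_category`, `albuminuria_category`, and `place_on_grid` are hypothetical, and the sketch ignores chronicity (persistence over at least three months), missing measurements, and dialysis or transplant status.

```python
from typing import Tuple


def gfr_category(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO 2012 G category."""
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(uacr: float) -> str:
    """Map a urine albumin-to-creatinine ratio (mg/g) to a KDIGO 2012 A category."""
    if uacr < 0:
        raise ValueError("UACR cannot be negative")
    if uacr < 30:
        return "A1"
    if uacr <= 300:
        return "A2"
    return "A3"


def place_on_grid(egfr: float, uacr: float) -> Tuple[str, str]:
    """Return the (G, A) cell for a single pair of measurements.

    Hypothetical helper: a real phenotype would first confirm that the
    abnormality persists over time before assigning a stage.
    """
    return gfr_category(egfr), albuminuria_category(uacr)


if __name__ == "__main__":
    # One illustrative pair of values: moderately reduced eGFR with
    # moderately increased albuminuria.
    print(place_on_grid(52.0, 120.0))
```

A single measurement pair is used here for simplicity; in practice, staging from medical records would aggregate repeated laboratory values per patient before the lookup.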