Academic article

Resource profile and user guide of the Polygenic Index Repository
Document Type
article
Source
Nature Human Behaviour. 5(12)
Subject
Epidemiology
Health Sciences
2.6 Resources and infrastructure (aetiology)
Aetiology
Data Analysis
Databases, Genetic
Genome-Wide Association Study
Humans
Multifactorial Inheritance
Polymorphism, Single Nucleotide
Polygenic score
Polygenic index
Repository
Measurement error
23andMe Research Group
Biomedical and clinical sciences
Health sciences
Psychology
Language
Abstract
Polygenic indexes (PGIs) are DNA-based predictors. Their value for research in many scientific disciplines is growing rapidly. As a resource for researchers, we used a consistent methodology to construct PGIs for 47 phenotypes in 11 datasets. To maximize the PGIs' prediction accuracies, we constructed them using genome-wide association studies (some not previously published) from multiple data sources, including 23andMe and UK Biobank. We present a theoretical framework to help interpret analyses involving PGIs. A key insight is that a PGI can be understood as an unbiased but noisy measure of a latent variable we call the 'additive SNP factor'. Regressions in which the true regressor is this factor but the PGI is used as its proxy therefore suffer from errors-in-variables bias. We derive an estimator that corrects for the bias, illustrate the correction, and make a Python tool for implementing it publicly available.
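The errors-in-variables bias described in the abstract can be illustrated with a small simulation. The sketch below is not the estimator derived in the article and does not use the article's Python tool; it shows only the textbook case of classical measurement error, where the slope from regressing an outcome on a noisy proxy is shrunk by the reliability ratio and can be rescaled when that ratio is known. All function names, parameter values, and the assumption that the reliability is known are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=200_000, beta=0.5, noise_sd=1.0, seed=0):
    """Simulate a latent factor, a noisy proxy for it, and an outcome.

    Assumption: classical measurement error, i.e. the noise in the proxy
    is independent of the latent factor and of the outcome's error term.
    """
    rng = np.random.default_rng(seed)
    factor = rng.standard_normal(n)                    # latent regressor, variance 1
    pgi = factor + noise_sd * rng.standard_normal(n)   # unbiased but noisy proxy
    y = beta * factor + rng.standard_normal(n)         # outcome depends on the factor
    return pgi, y


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


def corrected_slope(x, y, reliability):
    """Rescale the naive slope by the reliability Var(factor) / Var(proxy)."""
    return ols_slope(x, y) / reliability


pgi, y = simulate()

# Known here only because the data are simulated: Var(factor) = 1, Var(noise) = 1.
reliability = 1.0 / (1.0 + 1.0 ** 2)

naive = ols_slope(pgi, y)                            # attenuated toward zero
corrected = corrected_slope(pgi, y, reliability)     # close to the true beta

print(f"naive slope:     {naive:.3f}")
print(f"corrected slope: {corrected:.3f}")
```

With these simulated values the naive slope lands near half of the true coefficient of 0.5, and the rescaled slope lands near 0.5. In real data the reliability is not known and has to be estimated, which is the part a dedicated estimator must handle.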