Academic article
Resource profile and user guide of the Polygenic Index Repository
Document Type
article
Author
Becker, Joel; Burik, Casper AP; Goldman, Grant; Wang, Nancy; Jayashankar, Hariharan; Bennett, Michael; Belsky, Daniel W; Karlsson Linnér, Richard; Ahlskog, Rafael; Kleinman, Aaron; Hinds, David A; Caspi, Avshalom; Corcoran, David L; Moffitt, Terrie E; Poulton, Richie; Sugden, Karen; Williams, Benjamin S; Harris, Kathleen Mullan; Steptoe, Andrew; Ajnakina, Olesya; Milani, Lili; Esko, Tõnu; Iacono, William G; McGue, Matt; Magnusson, Patrik KE; Mallard, Travis T; Harden, K Paige; Tucker-Drob, Elliot M; Herd, Pamela; Freese, Jeremy; Young, Alexander; Beauchamp, Jonathan P; Koellinger, Philipp D; Oskarsson, Sven; Johannesson, Magnus; Visscher, Peter M; Meyer, Michelle N; Laibson, David; Cesarini, David; Benjamin, Daniel J; Turley, Patrick; Okbay, Aysu
Source
Nature Human Behaviour. 5(12)
Abstract
Polygenic indexes (PGIs) are DNA-based predictors. Their value for research in many scientific disciplines is growing rapidly. As a resource for researchers, we used a consistent methodology to construct PGIs for 47 phenotypes in 11 datasets. To maximize the PGIs' prediction accuracies, we constructed them using genome-wide association studies (some not previously published) from multiple data sources, including 23andMe and UK Biobank. We present a theoretical framework to help interpret analyses involving PGIs. A key insight is that a PGI can be understood as an unbiased but noisy measure of a latent variable we call the 'additive SNP factor'. Regressions in which the true regressor is this factor but the PGI is used as its proxy therefore suffer from errors-in-variables bias. We derive an estimator that corrects for the bias, illustrate the correction, and make a Python tool for implementing it publicly available.
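As a rough illustration of the errors-in-variables point, the Python sketch below simulates a latent factor, a noisy proxy for it, and an outcome, then compares the naive regression slope with a slope rescaled by the reliability ratio. This is the textbook classical measurement-error correction, not the authors' estimator or their tool. The function names, the simulated data, and the assumption that the noise variance (and hence the reliability) is known are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=200_000, beta=0.5, noise_sd=1.0, seed=0):
    """Simulate a latent factor, a noisy proxy (stand-in for a PGI), and an outcome.

    All values are synthetic; nothing here comes from the Repository data.
    """
    rng = np.random.default_rng(seed)
    factor = rng.standard_normal(n)                    # latent regressor, variance 1
    pgi = factor + noise_sd * rng.standard_normal(n)   # unbiased but noisy proxy
    y = beta * factor + rng.standard_normal(n)         # outcome depends on the factor
    return pgi, y


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x (with intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


def corrected_slope(x, y, reliability):
    """Undo classical attenuation by dividing the naive slope by the reliability.

    reliability = var(latent factor) / var(proxy); assumed known in this sketch.
    """
    return ols_slope(x, y) / reliability


true_beta = 0.5
noise_sd = 1.0
pgi, y = simulate(beta=true_beta, noise_sd=noise_sd)

# With a unit-variance factor, reliability = 1 / (1 + noise_sd**2).
reliability = 1.0 / (1.0 + noise_sd**2)

naive = ols_slope(pgi, y)                            # attenuated toward zero
corrected = corrected_slope(pgi, y, reliability)     # rescaled estimate
```

In this setup the naive slope converges to the true coefficient multiplied by the reliability, so dividing by the reliability recovers the coefficient on the latent factor. In applied work the reliability is not known and has to be estimated, which is the harder part of the problem.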