Academic article
A global reference for human genetic variation
Document Type
article
Author
Auton, Adam; Abecasis, Gonçalo R; Altshuler, David M; Durbin, Richard M; Bentley, David R; Chakravarti, Aravinda; Clark, Andrew G; Donnelly, Peter; Eichler, Evan E; Flicek, Paul; Gabriel, Stacey B; Gibbs, Richard A; Green, Eric D; Hurles, Matthew E; Knoppers, Bartha M; Korbel, Jan O; Lander, Eric S; Lee, Charles; Lehrach, Hans; Mardis, Elaine R; Marth, Gabor T; McVean, Gil A; Nickerson, Deborah A; Schmidt, Jeanette P; Sherry, Stephen T; Wang, Jun; Wilson, Richard K; Barnes, Kathleen C; Beiswanger, Christine; Burchard, Esteban G; Bustamante, Carlos D; Cai, Hongyu; Cao, Hongzhi; Gerry, Norman P; Gharani, Neda; Gignoux, Christopher R; Gravel, Simon; Henn, Brenna; Jones, Danielle; Jorde, Lynn; Kaye, Jane S; Keinan, Alon; Kent, Alastair; Kerasidou, Angeliki; Li, Yingrui; Mathias, Rasika; Moreno-Estrada, Andres; Ossorio, Pilar N; Parker, Michael; Resch, Alissa M; Rotimi, Charles N; Royal, Charmaine D; Sandoval, Karla; Su, Yeyang; Sudbrak, Ralf; Tian, Zhongming; Tishkoff, Sarah; Toji, Lorraine H; Tyler-Smith, Chris; Via, Marc; Wang, Yuhong; Yang, Huanming; Yang, Ling; Zhu, Jiayong; Brooks, Lisa D; Felsenfeld, Adam L; McEwen, Jean E; Vaydylevich, Yekaterina; Duncanson, Audrey; Dunn, Michael; Schloss, Jeffery A; Garrison, Erik P; Min Kang, Hyun; Marchini, Jonathan L; McCarthy, Shane
Source
Nature. 526(7571)
Abstract
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.