Academic article

Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology
Document Type
article
Source
Nature Genetics. 49(11)
Subject
Biological Sciences
Genetics
Biotechnology
Networking and Information Technology R&D (NITRD)
Lung
Human Genome
2.5 Research design and methodologies (aetiology)
Aetiology
Generic health relevance
Good Health and Well Being
Big Data
Fibrinogen
Genetics, Population
Genome
Humans
Information Dissemination
Mobile Applications
Molecular Epidemiology
Regression Analysis
Software
Workflow
NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium
Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium
TOPMed Hematology and Hemostasis Working Group
CHARGE Analysis and Bioinformatics Working Group
Medical and Health Sciences
Developmental Biology
Agricultural biotechnology
Bioinformatics and computational biology
Language
Abstract
The exploding volume of whole-genome sequence (WGS) and multi-omics data requires new approaches for analysis. As one solution, we have created a cloud-based Analysis Commons, which brings together genotype and phenotype data from multiple studies in a setting that is accessible to multiple investigators. This framework addresses many of the challenges of multi-center WGS analyses, including data-sharing mechanisms, phenotype harmonization, integrated multi-omics analyses, annotation, and computational flexibility. In this setting, the computational pipeline facilitates a sequence-to-discovery analysis workflow, illustrated here by an analysis of plasma fibrinogen levels in 3,996 individuals from the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) WGS program. The Analysis Commons represents a novel model for transforming WGS resources from a massive quantity of phenotypic and genomic data into knowledge of the determinants of health and disease risk in diverse human populations.