Academic Article

DeepWAS: Multivariate genotype-phenotype associations by directly integrating regulatory information using deep learning.
Document Type
Article
Source
PLoS Computational Biology. 2/3/2020, Vol. 16 Issue 2, p1-28. 28p. 4 Diagrams, 1 Chart, 1 Graph.
Subject
*MENTAL depression
*STATURE
*CONCEPT mapping
*CIS-regulatory elements (Genetics)
*DEEP learning
*MULTIPLE sclerosis
*UNIVARIATE analysis
Language
ISSN
1553-734X
Abstract
Genome-wide association studies (GWAS) identify genetic variants associated with traits or diseases. GWAS never directly link variants to regulatory mechanisms. Instead, the functional annotation of variants is typically inferred by post hoc analyses. A specific class of deep learning-based methods allows for the prediction of regulatory effects per variant on several cell type-specific chromatin features. We here describe "DeepWAS", a new approach that integrates these regulatory effect predictions of single variants into a multivariate GWAS setting. Thereby, single variants associated with a trait or disease are directly coupled to their impact on a chromatin feature in a cell type. Up to 61 regulatory SNPs, called dSNPs, were associated with multiple sclerosis (MS, 4,888 cases and 10,395 controls), major depressive disorder (MDD, 1,475 cases and 2,144 controls), and height (5,974 individuals). These variants were mainly non-coding and reached at least nominal significance in classical GWAS. The prediction accuracy was higher for DeepWAS than for classical GWAS models for 91% of the genome-wide significant, MS-specific dSNPs. dSNPs were enriched in public or cohort-matched expression and methylation quantitative trait loci, and we demonstrated the potential of DeepWAS to generate testable functional hypotheses based on genotype data alone. DeepWAS is available at https://github.com/cellmapslab/DeepWAS.
Author summary: In the era of steadily increasing amounts of available genetic data, we still lack novel and innovative ideas on how to improve fine-mapping of regulatory variants identified by genome-wide association studies (GWAS), especially in non-coding regions. Current approaches for the identification of functional variants conduct functional annotation after the GWAS analysis, using either position-based overlaps of each variant with regulatory elements or deep-learning-based methods that predict regulatory effects per variant on cell-type-specific chromatin features. We here present DeepWAS, which integrates these regulatory effect predictions of single variants into a multivariate GWAS setting. Our results provide evidence that DeepWAS directly identifies disease/trait-associated SNPs with a common effect on a specific chromatin feature in a relevant tissue. We can show for multiple sclerosis, major depressive disorder, and body height that the SNPs identified by DeepWAS are at least nominally significant in classical univariate GWAS analysis of the same cohorts or larger published GWAS. By integrating expression and methylation quantitative trait loci (eQTL and meQTL) information from multiple resources and tissues, we can show that DeepWAS identifies disease/trait-relevant transcriptionally active genomic loci. We demonstrate that DeepWAS both identifies known variants and highlights underlying molecular mechanisms. [ABSTRACT FROM AUTHOR]
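Illustrative sketch (not part of the published record): the abstract describes coupling per-variant regulatory effect predictions to a multivariate association model. The toy Python below shows one way that idea could look: SNPs are grouped by the (chromatin feature, cell type) unit they are predicted to affect, and each group is then modeled jointly against the phenotype. Everything specific here is an assumption made for illustration, including the score threshold, the L1-penalized regression fitted by coordinate descent, the function names, and the made-up genotypes and labels. It is not the DeepWAS implementation, which is available at the repository linked in the abstract.

```python
from collections import defaultdict


def group_by_functional_unit(effect_scores, threshold=0.5):
    """Map each (chromatin feature, cell type) unit to the SNPs predicted to affect it.

    effect_scores: {(snp_id, unit): predicted effect score}; the threshold is hypothetical.
    """
    groups = defaultdict(list)
    for (snp, unit), score in effect_scores.items():
        if score >= threshold:
            groups[unit].append(snp)
    return {unit: sorted(snps) for unit, snps in groups.items()}


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def soft_threshold(value, penalty):
    """Shrink a value toward zero by the penalty (the L1 proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, y, penalty, n_iter=200):
    """Minimize (1/2n) * ||y - X b||^2 + penalty * ||b||_1 for centered columns and y."""
    n = len(y)
    beta = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, xj in enumerate(columns):
            # Residual of y after removing the contribution of all other SNPs.
            resid = [
                y[i] - sum(beta[k] * columns[k][i]
                           for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(xj[i] * resid[i] for i in range(n)) / n
            norm = sum(v * v for v in xj) / n
            beta[j] = soft_threshold(rho, penalty) / norm if norm > 0 else 0.0
    return beta


def fit_units(genotypes, phenotype, groups, penalty=0.1):
    """Fit one joint (multivariate) model per functional unit."""
    y = center(phenotype)
    results = {}
    for unit, snps in groups.items():
        columns = [center(genotypes[s]) for s in snps]
        results[unit] = dict(zip(snps, lasso_coordinate_descent(columns, y, penalty)))
    return results


# Toy data: allele counts (0/1/2) for six individuals; all identifiers are made up.
genotypes = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [1, 2, 1, 1, 0, 1],
    "rs3": [2, 0, 1, 0, 2, 1],
}
phenotype = [2 * g for g in genotypes["rs1"]]  # trait driven by rs1 only

unit_a = ("DNase", "cell_type_A")
unit_b = ("H3K4me3", "cell_type_B")
effect_scores = {
    ("rs1", unit_a): 0.9,
    ("rs2", unit_a): 0.7,
    ("rs3", unit_a): 0.1,  # below threshold, so excluded from unit_a
    ("rs3", unit_b): 0.8,
}

groups = group_by_functional_unit(effect_scores)
results = fit_units(genotypes, phenotype, groups)
for unit, coefficients in results.items():
    print(unit, coefficients)
```

In this toy setup the jointly fitted model for the first unit keeps a non-zero coefficient for rs1 and shrinks rs2 to zero, which mirrors the idea that a selected variant is reported together with the chromatin feature and cell type it is predicted to affect.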