Academic Paper

Comethyl: a network-based methylome approach to investigate the multivariate nature of health and disease
Document Type
article
Source
Briefings in Bioinformatics. 23(2)
Subject
Biological Sciences
Bioinformatics and Computational Biology
Genetics
Mental Health
Autism
Human Genome
Intellectual and Developmental Disabilities (IDD)
Brain Disorders
Biotechnology
Neurosciences
Generic health relevance
Mental health
Good Health and Well Being
Autism Spectrum Disorder
CpG Islands
DNA Methylation
Epigenesis, Genetic
Epigenome
Genome-Wide Association Study
Humans
Infant, Newborn
Male
DNA methylation
whole-genome bisulfite sequencing
epigenetics
epigenome
weighted gene correlation network analysis
systems biology
autism spectrum disorder
Biochemistry and Cell Biology
Computation Theory and Mathematics
Other Information and Computing Sciences
Bioinformatics
Biochemistry and cell biology
Bioinformatics and computational biology
Language
Abstract
Health outcomes are frequently shaped by difficult-to-dissect interrelationships among biological, behavioral, social and environmental factors. DNA methylation patterns reflect such multivariate intersections, providing a rich source of novel biomarkers and insight into disease etiologies. Recent advances in whole-genome bisulfite sequencing enable investigation of DNA methylation over all genomic CpGs, but existing bioinformatic approaches lack accessible system-level tools. Here, we develop the R package Comethyl, which performs weighted gene correlation network analysis on user-defined genomic regions to generate modules of comethylated regions; these modules are then tested for correlations with multivariate sample traits. First, regions are defined by CpG genomic location or regulatory annotation and filtered based on CpG count, sequencing depth and variability. Next, correlation networks built from methylation values within the selected regions are used to find modules of interconnected nodes. Each module containing multiple comethylated regions is reduced in complexity to a single eigennode value, which is then tested for correlations with experimental metadata. Comethyl can cover the noncoding regulatory regions of the genome, which are highly relevant to interpreting genome-wide association studies and integrating other types of epigenomic data. We demonstrate the utility of Comethyl on a dataset of male cord blood samples from newborns later diagnosed with autism spectrum disorder (ASD) versus typical development. Comethyl successfully identified an ASD-associated module containing regions mapped to genes enriched for brain glial functions. Comethyl is expected to be useful in uncovering the multivariate nature of health disparities for a variety of common disorders. Comethyl is available at github.com/cemordaunt/comethyl with complete documentation and example analyses.
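The workflow in the abstract (filter regions, build a correlation network, reduce each module to an eigennode, correlate with traits) can be illustrated with a minimal Python sketch. This is not Comethyl's R implementation or API. All function names, region names, thresholds and methylation values below are hypothetical toy choices. Filtering here uses variability only (not CpG count or sequencing depth), and connected components on a soft-thresholded adjacency stand in for the hierarchical clustering that WGCNA-style tools normally use.

```python
import math
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def find_modules(meth, power=6, cutoff=0.2):
    """Group regions whose adjacency |r|**power exceeds cutoff (assumed values).

    Simplification: connected components replace hierarchical clustering.
    """
    n = len(meth)
    label = list(range(n))  # every region starts in its own component

    def root(i):
        while label[i] != i:
            i = label[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(meth[i], meth[j])) ** power > cutoff:
                label[root(j)] = root(i)  # merge the two components
    groups = {}
    for i in range(n):
        groups.setdefault(root(i), []).append(i)
    return [g for g in groups.values() if len(g) > 1]  # drop singletons


def eigennode(rows, iters=100):
    """First principal component across samples, found by power iteration."""
    z = []
    for r in rows:
        m, sd = mean(r), pstdev(r)
        z.append([(v - m) / sd for v in r])  # standardize each region
    vec = z[0][:]  # start in the row space; a constant vector would map to zero
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(row, vec)) for row in z]  # Z v
        vec = [sum(sc * row[k] for sc, row in zip(scores, z))
               for k in range(len(vec))]  # Z^T (Z v)
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    # Orient the eigennode so it tracks the module's average methylation.
    avg = [mean(col) for col in zip(*rows)]
    return vec if pearson(vec, avg) >= 0 else [-v for v in vec]


# Toy data: 8 samples, binary trait (hypothetical labels, e.g. 0 = TD, 1 = ASD).
trait = [0, 0, 0, 0, 1, 1, 1, 1]
regions = {
    "region_a": [0.30, 0.32, 0.29, 0.31, 0.60, 0.62, 0.59, 0.61],
    "region_b": [0.40, 0.41, 0.39, 0.42, 0.70, 0.69, 0.71, 0.72],
    "region_c": [0.20, 0.22, 0.21, 0.19, 0.50, 0.52, 0.49, 0.51],
    "region_d": [0.50, 0.70, 0.50, 0.70, 0.50, 0.70, 0.50, 0.70],
    "region_e": [0.80, 0.80, 0.60, 0.60, 0.80, 0.80, 0.60, 0.60],
    "region_f": [0.90, 0.90, 0.91, 0.90, 0.90, 0.91, 0.90, 0.90],
}

# Step 1: keep only variable regions (SD threshold is an assumed value).
kept = {name: vals for name, vals in regions.items() if pstdev(vals) >= 0.05}
names = list(kept)

# Step 2: find modules of comethylated regions.
modules = find_modules([kept[n] for n in names])
module_names = [[names[i] for i in m] for m in modules]

# Step 3: summarize the first module as an eigennode and test it against the trait.
me = eigennode([kept[n] for n in module_names[0]])
trait_cor = pearson(me, trait)
print(module_names, round(trait_cor, 3))
```

In this toy data, the three regions that shift together between trait groups form one module, while the two unrelated regions and the near-constant region are excluded. A real analysis would also need significance testing and adjustment for confounding variables, which this sketch omits.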