Academic Article

InTAD: chromosome conformation guided analysis of enhancer target genes
Document Type
article
Source
BMC Bioinformatics, Vol 20, Iss 1, Pp 1-7 (2019)
Subject
Epigenomics
Transcriptomics
Topologically associated domains
Enhancers
Computer applications to medicine. Medical informatics
R858-859.7
Biology (General)
QH301-705.5
Language
English
ISSN
1471-2105
Abstract
Abstract Background High-throughput technologies for analyzing chromosome conformation at a genome scale have revealed that chromatin is organized in topologically associated domains (TADs). While TADs are relatively stable across cell types, intra-TAD activities are cell type specific. Epigenetic profiling of different tissues and cell-types has identified a large number of non-coding epigenetic regulatory elements (‘enhancers’) that can be located far away from coding genes. Linear proximity is a commonly chosen criterion for associating enhancers with their potential target genes. While enhancers frequently regulate the closest gene, unambiguous identification of enhancer regulated genes remains to be a challenge in the absence of sample matched chromosome conformation data. Results To associate enhancers with their target genes, we have previously developed and applied a method that tests for significant correlations between enhancer and gene expressions across a cohort of samples. To limit the number of tests, we constrain this analysis to gene-enhancer pairs embedded in the same TAD, where information on TAD boundaries is borrowed from publicly available chromosome conformation capturing (‘Hi-C’) data. We have now implemented this method as an R Bioconductor package ‘InTAD’ and verified the software package by reanalyzing available enhancer and gene expression data derived from ependymoma brain tumors. Conclusion The open-source package InTAD is an easy-to-use software tool for identifying proximal and distal enhancer target genes by leveraging information on correlated expression of enhancers and genes that are located in the same TAD. InTAD can be applied to any heterogeneous cohort of samples analyzed by a combination of gene expression and epigenetic profiling techniques and integrates either public or custom information of TAD boundaries.