Academic Paper

THUNDER: A reference-free deconvolution method to infer cell type proportions from bulk Hi-C data.
Document Type
Article
Source
PLoS Genetics. 3/8/2022, Vol. 18 Issue 3, p1-18. 18p.
Subject
*DECONVOLUTION (Mathematics)
*CHROMATIN
*CHROMOSOMES
*DATA analysis
Language
ISSN
1553-7390
Abstract
Hi-C data provide population-averaged estimates of three-dimensional chromatin contacts across cell types and states in bulk samples. Effective analysis of Hi-C data entails controlling for the potential confounding factor of differential cell type proportions across heterogeneous bulk samples. We propose a novel unsupervised deconvolution method for inferring cell type composition from bulk Hi-C data, the Two-step Hi-c UNsupervised DEconvolution appRoach (THUNDER). We conducted extensive simulations to test THUNDER based on combining two published single-cell Hi-C (scHi-C) datasets. THUNDER more accurately estimates the underlying cell type proportions compared to reference-free methods (e.g., TOAST and NMF) and is more robust than reference-dependent methods (e.g., MuSiC). We further demonstrate the practical utility of THUNDER to estimate cell type proportions and identify cell-type-specific interactions in Hi-C data from adult human cortex tissue samples. THUNDER will be a useful tool in adjusting for varying cell type composition in population samples, facilitating valid and more powerful downstream analysis such as differential chromatin organization studies. Additionally, THUNDER-estimated contact profiles provide a useful exploratory framework to investigate cell-type-specificity of the chromatin interactome while experimental data are still rare.

Author summary: Hi-C data is used to understand how chromosomes are ordered in a cell. Often, this data is made up of different kinds of cells. Usually, we do not know the number of each kind of cell in the data. When we study Hi-C data, we must learn which part of the data comes from each kind of cell. If not, what we learn in our study might be wrong. As of now, there is no such approach that can do this for Hi-C data. We created an approach, THUNDER, to learn the parts of Hi-C data for all the kinds of cells in our data. We showed that our approach learns well in many settings using data we created. We then used our approach on real Hi-C data to show how others can use it in their work. [ABSTRACT FROM AUTHOR]
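
Illustrative note: the abstract names non-negative matrix factorization (NMF) as one of the reference-free baselines. The sketch below shows the general idea behind reference-free deconvolution with plain NMF on a toy mixture. It is not THUNDER's algorithm, which this record does not describe in detail. All function names, the toy contact profiles, and the mixing proportions are hypothetical, and the rescaling step assumes every cell type contributes the same total number of contacts.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, k, n_iter=500, seed=0, eps=1e-12):
    """Factor v (features x samples) as w (features x k) times h (k x samples)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(k)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_rows)]
    return w, h

def to_proportions(w, h):
    """Turn mixing weights into per-sample proportions.

    Assumption: each cell type has the same total contact count, so each
    column of w is rescaled to sum to 1 and the matching row of h absorbs
    the scale before the columns of h are normalised.
    """
    scale = [sum(col) for col in zip(*w)]
    scaled = [[x * s for x in row] for row, s in zip(h, scale)]
    totals = [sum(col) for col in zip(*scaled)]
    return [[x / t for x, t in zip(row, totals)] for row in scaled]

# Hypothetical toy data: 6 contact features, 2 cell types, 4 bulk samples.
profiles = [[8, 1], [6, 1], [5, 2], [1, 7], [2, 6], [1, 6]]
true_props = [[0.8, 0.6, 0.3, 0.1], [0.2, 0.4, 0.7, 0.9]]
bulk = matmul(profiles, true_props)  # each bulk sample is a mixture of profiles

w, h = nmf(bulk, k=2)
est = to_proportions(w, h)
for row in est:
    print([round(x, 2) for x in row])
```

NMF solutions are not unique and the order of the recovered components is arbitrary, so the estimated rows may be permuted relative to `true_props` and need not match them exactly.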