Academic Article

CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates.
Document Type
Article
Source
BMC Bioinformatics. 12/28/2017, Vol. 18, p243-253. 11p. 1 Chart, 4 Graphs.
Subject
*RNA sequencing
*GENE expression
*NUCLEOTIDE sequencing
*BAYESIAN analysis
*COMPUTER simulation
Language
ISSN
1471-2105
Abstract
Background: In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample library sizes, and then the resulting normalized gene counts are used as input for parametric or non-parametric differential gene expression tests. However, a distribution of true gene counts, each with a different probability, can result in the same observed gene count. Importantly, sequencing coverage information is currently not explicitly incorporated into any of the statistical models used for RNA-Seq analysis. Results: We developed a fast Bayesian method which uses the sequencing coverage information determined from the concentration of an RNA sample to estimate the posterior distribution of a true gene count. Our method performs better than or comparably to NOISeq and GFOLD, according to the results from simulations and experiments with real unreplicated data. We incorporated a previously unused sequencing coverage parameter into a procedure for differential gene expression analysis with RNA-Seq data. Conclusions: Our results suggest that our method can be used to overcome analytical bottlenecks in experiments with a limited number of replicates and low sequencing coverage. [ABSTRACT FROM AUTHOR]
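The abstract's point that many true gene counts can produce the same observed count, and that coverage changes how plausible each one is, can be illustrated with a toy calculation. The sketch below is not the CORNAS model. It assumes, purely for illustration, that each true transcript is sequenced independently with probability equal to the coverage (binomial sampling) and that the prior over true counts is flat. Under those assumptions the posterior over the unobserved transcripts is a negative binomial distribution. The function names are hypothetical.

```python
import math


def posterior_pmf(true_count, observed, coverage):
    """P(true_count | observed) under the illustrative assumptions:
    observed ~ Binomial(true_count, coverage), flat prior on true_count.
    Normalizing the likelihood over true_count gives
    C(n, x) * p^(x + 1) * (1 - p)^(n - x), a negative binomial in n - x.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    if observed < 0:
        raise ValueError("observed count must be non-negative")
    if true_count < observed:
        return 0.0  # cannot observe more reads than true transcripts
    missed = true_count - observed
    log_comb = (math.lgamma(true_count + 1)
                - math.lgamma(observed + 1)
                - math.lgamma(missed + 1))
    log_p = (log_comb
             + (observed + 1) * math.log(coverage)
             + missed * math.log1p(-coverage))
    return math.exp(log_p)


def posterior_mean(observed, coverage):
    """Closed-form posterior mean of the true count for the toy model."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    return observed + (observed + 1) * (1.0 - coverage) / coverage


def credible_interval(observed, coverage, mass=0.95):
    """Equal-tailed credible interval, found by accumulating the pmf."""
    if not 0.0 < mass < 1.0:
        raise ValueError("mass must lie strictly between 0 and 1")
    lower_tail = (1.0 - mass) / 2.0
    upper_tail = 1.0 - lower_tail
    cumulative = 0.0
    lower = None
    n = observed
    while True:
        cumulative += posterior_pmf(n, observed, coverage)
        if lower is None and cumulative >= lower_tail:
            lower = n
        if cumulative >= upper_tail:
            return lower, n
        n += 1


if __name__ == "__main__":
    # Same observed count, two hypothetical coverage values.
    for cov in (0.5, 0.05):
        print(cov, posterior_mean(10, cov), credible_interval(10, cov))
```

In this toy model the same observed count of 10 corresponds to a much wider range of plausible true counts at low coverage than at high coverage, which is the intuition behind making the analysis coverage-dependent.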