Academic Paper

dsRID: in silico identification of dsRNA regions using long-read RNA-seq data
Document Type
article
Source
Bioinformatics. 39(11)
Subject
Biological Sciences
Bioinformatics and Computational Biology
Genetics
Alzheimer's Disease including Alzheimer's Disease Related Dementias (AD/ADRD)
Human Genome
Alzheimer's Disease
Aging
Brain Disorders
Dementia
Neurosciences
Neurodegenerative
Acquired Cognitive Impairment
1.1 Normal biological development and functioning
Underpinning research
Humans
RNA, Double-Stranded
RNA-Seq
Sequence Analysis, RNA
Base Sequence
Genome
Software
Mathematical Sciences
Information and Computing Sciences
Bioinformatics
Language
Abstract
Motivation: Double-stranded RNAs (dsRNAs) are potent triggers of innate immune responses upon recognition by cytosolic dsRNA sensor proteins. Identification of endogenous dsRNAs helps to better understand the dsRNAome and its relevance to innate immunity related to human diseases.

Results: Here, we report dsRID (double-stranded RNA identifier), a machine-learning-based method to predict dsRNA regions in silico, leveraging the power of long-read RNA-sequencing (RNA-seq) and molecular traits of dsRNAs. Using models trained with PacBio long-read RNA-seq data derived from Alzheimer's disease (AD) brain, we show that our approach is highly accurate in predicting dsRNA regions in multiple datasets. Applied to an AD cohort sequenced by the ENCODE consortium, we characterize the global dsRNA profile with potentially distinct expression patterns between AD and controls. Together, we show that dsRID provides an effective approach to capture global dsRNA profiles using long-read RNA-seq data.

Availability and implementation: The software implementation of dsRID and the genomic coordinates of regions predicted by dsRID in all samples are available at the GitHub repository: https://github.com/gxiaolab/dsRID.