Academic Journal Article

TWO-SIGMA-G: a new competitive gene set testing framework for scRNA-seq data accounting for inter-gene and cell–cell correlation.
Document Type
Article
Source
Briefings in Bioinformatics. May 2022, Vol. 23 Issue 3, p1-16. 16p.
Subject
*ALZHEIMER'S disease
*FALSE positive error
*INFERENTIAL statistics
*DATA distribution
*REGRESSION analysis
*DISEASE progression
Language
ISSN
1467-5463
Abstract
We propose TWO-SIGMA-G, a competitive gene set test for scRNA-seq data. TWO-SIGMA-G uses a mixed-effects regression model, based on our previously published TWO-SIGMA, to test for differential expression at the gene level. This regression-based model provides flexibility and rigor at the gene level in (1) handling complex experimental designs, (2) accounting for the correlation between biological replicates and (3) accommodating the distribution of scRNA-seq data to improve statistical inference. Moreover, TWO-SIGMA-G uses a novel approach to adjust for inter-gene correlation (IGC) at the set level to control the set-level false positive rate. Simulations demonstrate that TWO-SIGMA-G preserves type I error and increases power in the presence of IGC compared with other methods. Application to two datasets identified HIV-associated interferon pathways in xenograft mice and pathways associated with Alzheimer's disease progression in humans. [ABSTRACT FROM AUTHOR]