Academic Paper

Poisson-Tweedie mixed-effects model: a flexible approach for the analysis of longitudinal RNA-seq data
Document Type
Working Paper
Source
Statistical Modelling (2020)
Subject
Statistics - Methodology
Statistics - Applications
Statistics - Computation
Language
Abstract
We present a new modelling approach for longitudinal count data that is motivated by the increasing availability of longitudinal RNA-sequencing experiments. The distribution of RNA-seq counts typically exhibits overdispersion, zero-inflation and heavy tails; moreover, in longitudinal designs repeated measurements from the same subject are typically (positively) correlated. We propose a generalized linear mixed model based on the Poisson-Tweedie distribution that can flexibly handle each of the aforementioned features of longitudinal overdispersed counts. We develop a computational approach to accurately evaluate the likelihood of the proposed model and to perform maximum likelihood estimation. Our approach is implemented in the R package ptmixed, which can be freely downloaded from CRAN. We assess the performance of ptmixed on simulated data and we present an application to a dataset with longitudinal RNA-sequencing measurements from healthy and dystrophic mice. The applicability of the Poisson-Tweedie mixed-effects model is not restricted to longitudinal RNA-seq data, but it extends to any scenario where non-independent measurements of a discrete overdispersed response variable are available.
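The two features the abstract highlights, overdispersion and positive within-subject correlation, can be illustrated with a small simulation. The sketch below is not the ptmixed implementation and does not use the full Poisson-Tweedie distribution: it assumes a negative binomial response (a member of the Poisson-Tweedie family) generated as a gamma-Poisson mixture, with a log link and a normally distributed random intercept per subject. All function names, parameter names and parameter values (`beta0`, `beta1`, `sigma_u`, `shape`) are hypothetical choices made for illustration only.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the small means used in this sketch.
    """
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_subjects=2000, n_times=3, beta0=1.0, beta1=0.2,
                    sigma_u=0.7, shape=2.0, seed=1):
    """Simulate longitudinal counts with a subject-level random intercept.

    Assumed model (illustrative only):
      u_i ~ Normal(0, sigma_u^2)
      mu_ij = exp(beta0 + beta1 * t_j + u_i)
      y_ij | u_i ~ negative binomial with mean mu_ij,
      drawn as Poisson(lambda) with lambda ~ Gamma(shape, mu_ij / shape).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_subjects):
        u = rng.gauss(0.0, sigma_u)  # shared by all measurements of a subject
        row = []
        for t in range(n_times):
            mu = math.exp(beta0 + beta1 * t + u)
            lam = rng.gammavariate(shape, mu / shape)  # gamma with mean mu
            row.append(sample_poisson(lam, rng))
        counts.append(row)
    return counts


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / math.sqrt(variance(xs) * variance(ys))


if __name__ == "__main__":
    counts = simulate_counts()
    first = [row[0] for row in counts]
    second = [row[1] for row in counts]
    # A variance-to-mean ratio above 1 indicates overdispersion.
    print("variance / mean at time 0:", variance(first) / mean(first))
    # The shared random intercept induces positive correlation over time.
    print("correlation, time 0 vs time 1:", pearson(first, second))
```

Under these assumptions the variance-to-mean ratio exceeds 1 and repeated measurements from the same subject are positively correlated. A Poisson model with independent observations would reproduce neither property.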
Comment: The final (published) version of the article can be downloaded for free (Open Access) from the publisher's website via the article's DOI link. Link to the R package ptmixed: https://cran.r-project.org/web/packages/ptmixed/index.html