Academic Article

Scalable gradients enable Hamiltonian Monte Carlo sampling for phylodynamic inference under episodic birth-death-sampling models.
Document Type
Article
Source
PLoS Computational Biology. 3/29/2024, Vol. 20 Issue 3, p1-23. 23p.
Subject
*DISTRIBUTION (Probability theory)
*INFECTIOUS disease transmission
*SEASONAL influenza
*COMMUNICABLE diseases
*INFLUENZA viruses
Language
English
ISSN
1553-734X
Abstract
Birth-death models play a key role in phylodynamic analysis because they can be interpreted in terms of key epidemiological parameters. In particular, models with piecewise-constant rates that vary across epochs in time, which we refer to as episodic birth-death-sampling (EBDS) models, are valuable because they reflect changing transmission dynamics over time. A persistent challenge with current time-varying model inference procedures, however, is their lack of computational efficiency. This limitation hinders the full use of these models in large-scale phylodynamic analyses, especially when dealing with high-dimensional parameter vectors that exhibit strong correlations. We present here a linear-time algorithm to compute the gradient of the birth-death model sampling density with respect to all time-varying parameters, and we implement this algorithm within a gradient-based Hamiltonian Monte Carlo (HMC) sampler to alleviate the computational burden of conducting inference under a wide variety of structures of, as well as priors for, EBDS processes. We assess this approach using three real-world data examples: the HIV epidemic in Odesa, Ukraine; seasonal influenza A/H3N2 virus dynamics in New York State, USA; and the Ebola outbreak in West Africa. HMC sampling exhibits a substantial efficiency boost, delivering a 10- to 200-fold increase in minimum effective sample size per unit time in comparison to a Metropolis-Hastings-based approach. Additionally, we show the robustness of our implementation, both in allowing for flexible prior choices and in modeling the transmission dynamics of various pathogens by accurately capturing the changing trend of the viral effective reproductive number.
Author summary: Epidemic control and forecasting rely on accurate quantification of transmission and recovery dynamics. This quantification is achievable through the analysis of phylogenetic relationships among pathogen strains obtained from infected individuals. As a key analytical tool for such inference, we concentrate on the study of episodic birth-death-sampling (EBDS) models. These models define a probability distribution on time-calibrated phylogenies that enables estimation of time-varying rates of pathogen transmission, recovery, and sampling. Advances in sequencing technology, however, have led to an increasing amount of genetic data collected from these pathogens. Consequently, the traditional computational methods for analyzing these large-scale data under EBDS models have become inadequate due to their high computational load. We aimed to break this computational bottleneck by developing a new approach, based on Hamiltonian Monte Carlo sampling, that considerably accelerates inference of all rate parameters compared to the traditional random-walk Metropolis-Hastings method. Our method greatly improves the ability to explore the complex distributions that arise when we want to understand disease dynamics at a more granular temporal resolution. This advancement delivers value to public health, as it supports rapid data-driven decision-making during outbreaks and enhances our understanding of the spread of infectious diseases. [ABSTRACT FROM AUTHOR]
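The abstract contrasts gradient-based HMC with random-walk Metropolis-Hastings on strongly correlated parameters. The following is a minimal sketch of a generic HMC sampler with a leapfrog integrator, not the authors' implementation: the target is an assumed toy bivariate Gaussian with correlation `RHO` standing in for an EBDS posterior, and the names `log_density`, `gradient`, `leapfrog`, and `hmc`, as well as the step size and trajectory length, are hypothetical choices made for illustration.

```python
import math
import random

RHO = 0.9  # assumed correlation of the toy target (not from the article)


def log_density(x):
    """Log density, up to a constant, of a zero-mean bivariate Gaussian
    with unit variances and correlation RHO (a stand-in target)."""
    a, b = x
    return -(a * a - 2 * RHO * a * b + b * b) / (2 * (1 - RHO ** 2))


def gradient(x):
    """Analytic gradient of log_density with respect to x."""
    a, b = x
    c = 1.0 / (1 - RHO ** 2)
    return [-(a - RHO * b) * c, -(b - RHO * a) * c]


def hamiltonian(x, p):
    """Potential energy (negative log density) plus kinetic energy."""
    return -log_density(x) + 0.5 * sum(pi * pi for pi in p)


def leapfrog(x, p, step, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    x, p = list(x), list(p)
    g = gradient(x)
    for _ in range(n_steps):
        p = [pi + 0.5 * step * gi for pi, gi in zip(p, g)]  # half step in momentum
        x = [xi + step * pi for xi, pi in zip(x, p)]        # full step in position
        g = gradient(x)
        p = [pi + 0.5 * step * gi for pi, gi in zip(p, g)]  # half step in momentum
    return x, p


def hmc(n_samples, step=0.15, n_steps=10, seed=1):
    """Draw samples with HMC; returns (samples, acceptance rate)."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    samples, accepted = [], 0
    for _ in range(n_samples):
        p = [rng.gauss(0.0, 1.0) for _ in x]  # refresh momentum
        x_new, p_new = leapfrog(x, p, step, n_steps)
        # Metropolis correction for the integrator's energy error
        log_accept = hamiltonian(x, p) - hamiltonian(x_new, p_new)
        if rng.random() < math.exp(min(0.0, log_accept)):
            x, accepted = x_new, accepted + 1
        samples.append(list(x))
    return samples, accepted / n_samples
```

Calling `hmc(2000)` returns the list of draws and the fraction of accepted proposals. In a real phylodynamic analysis, `log_density` and `gradient` would be replaced by the EBDS posterior and its gradient, which is the quantity the article's linear-time algorithm is described as computing.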