Academic Paper

Advances in statistical methodology and analysis in a study of ARC syndrome
Document Type
Electronic Thesis or Dissertation
Source
Subject
570.72
Language
English
Abstract
This thesis presents statistical analysis and methodology development for a systems analysis of ARC syndrome. ARC is a genetic disease caused by mutations in one of two proteins, VPS33B and VIPAS39, whose functions are poorly understood. Transcriptomic and metabolomic data are analysed to identify differentially expressed genes and pathways, and to highlight processes which are perturbed. Results consistently point to processes involved in cell polarisation and cell-cell adhesion, a finding corroborated by experimental work. Suggestions for future experimental work are included and have already yielded interesting results. Motivated by the desire to incorporate knowledge of genetic dependencies into this analysis, methodology is developed to enable Bayesian inference for ‘doubly-intractable distributions’. These models have a likelihood normalising term which is a function of unknown model parameters and which cannot be computed. This means that standard methods for sampling from the posterior, such as Markov chain Monte Carlo (MCMC), cannot be used. In the developed method, the likelihood is expressed as an infinite series which is then stochastically truncated. The resulting estimates, which are unbiased but possibly negative, can then be used in a pseudo-marginal MCMC scheme to compute expectations with respect to the posterior. Finally, methodology is developed to enable unbiased estimation for models in which data can be generated but no tractable likelihood is available. The main motivation for this is stochastic kinetic models used to describe complex and heterogeneous biological systems, but models of this type can be found across the sciences. Approximate Bayesian Computation is used to define a sequence of consistent Monte Carlo estimates, and these are then combined to produce an estimator which is unbiased with respect to the true posterior. Both approaches are demonstrated on a range of examples, followed by a critical assessment of their strengths and weaknesses.
