Academic Article

Dirichlet Process Mixture of Mixtures Model for Unsupervised Subword Modeling
Document Type
Article
Source
IEEE/ACM Transactions on Audio, Speech, and Language Processing; November 2018, Vol. 26, Issue 11, pp. 2027-2042 (16 pages)
Subject
Language
ISSN
2329-9290
Abstract
We develop a parallelizable Markov chain Monte Carlo sampler for a Dirichlet process mixture of mixtures model. Our sampler jointly infers a codebook and clusters. The codebook is a global collection of components, and each cluster is a mixture defined over the codebook. We combine a nonergodic Gibbs sampler with two layers of split and merge samplers, one at the codebook level and one at the mixture level, to form a valid ergodic chain. We design an additional switch sampler for components that supports convergence in our experimental results. In the use case of unsupervised subword modeling, we show that our method infers complex classes from real speech feature vectors that consistently show higher quality on several evaluation metrics. At the same time, compared to a standard Dirichlet process mixture model sampler, we infer fewer classes, which represent subword units more consistently and show longer durations.