학술논문

Local Minima Structures in Gaussian Mixture Models
Document Type
Periodical
Source
IEEE Transactions on Information Theory IEEE Trans. Inform. Theory Information Theory, IEEE Transactions on. 70(6):4218-4257 Jun, 2024
Subject
Communication, Networking and Broadcast Technologies
Signal Processing and Analysis
Sociology
Iterative methods
Gaussian mixture model
Mixture models
Approximation error
Standards
Maximum likelihood estimation
Gaussian mixture models
maximum likelihood estimation
nonconvex landscape
local minimizers
Language
ISSN
0018-9448
1557-9654
Abstract
We investigate the landscape of the negative log-likelihood function of Gaussian Mixture Models (GMMs) with a general number of components in the population limit. As the objective function is non-convex, there can exist multiple spurious local minima that are not globally optimal, even for well-separated mixture models. Our study reveals that all local minima share a common structure that partially identifies the cluster centers (i.e., means of the Gaussian components) of the true location mixture. Specifically, each local minimum can be represented as a non-overlapping combination of two types of sub-configurations: 1) fitting a single mean estimate to multiple Gaussian components; or 2) fitting multiple estimates to a single true component. These results apply to settings where the true mixture components satisfy a certain separation condition, and are valid even when the number of components is over- or under-specified. We also present a more fine-grained analysis for the setting of one-dimensional GMMs with three components, which provide sharper approximation error bounds with improved dependence on the separation parameter.