A Method of Moments for Mixture Models and Hidden Markov Models
This work addresses the high computational and sample complexity of mixture model estimation in applied statistics and machine learning and offers a practical alternative to EM, though it appears incremental because it builds on existing method of moments ideas.
The paper tackles parameter estimation for high-dimensional mixture models and hidden Markov models. Current practice relies on local search heuristics such as EM, which are prone to failure. The paper instead develops an efficient method of moments approach that yields rigorous unsupervised learning results not achieved by previous work.
Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm), which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity, which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works, and because of its simplicity, it offers a viable alternative to EM for practical deployment.
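
To make the idea of a method of moments for a multi-view mixture concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not necessarily the exact procedure of the paper. The assumptions are: three discrete views that are conditionally independent given the hidden component and are one-hot encoded, so that column j of each matrix M1, M2, M3 is the conditional distribution of that view under component j; and exact population moments in place of sample estimates. The function name recover_view1_columns and all numeric values are hypothetical. The sketch relies on the identity that, for any vector eta, the matrix B(eta) = (U1^T P123(eta) U2)(U1^T P12 U2)^{-1} has the columns of U1^T M1 as eigenvectors, where P12 = M1 diag(w) M2^T and P123(eta) = M1 diag(w) diag(M3^T eta) M2^T.

```python
import numpy as np

def recover_view1_columns(P12, P123_eta, k):
    """Recover the columns of M1, up to permutation, from cross-view moments.

    Hypothetical helper: assumes P12 has rank k and each column of M1
    is a probability vector (one-hot encoded discrete view).
    """
    # Top-k left and right singular vectors of the pairwise moment matrix.
    U, _, Vt = np.linalg.svd(P12)
    U1, U2 = U[:, :k], Vt[:k, :].T
    # B is similar to diag(M3^T eta); its eigenvectors are the columns of
    # U1^T M1, each up to an arbitrary scale factor.
    B = (U1.T @ P123_eta @ U2) @ np.linalg.inv(U1.T @ P12 @ U2)
    _, vecs = np.linalg.eig(B)
    # Map back to the original space; U1 U1^T acts as the identity on range(M1).
    cols = U1 @ np.real(vecs)
    # Remove the arbitrary scale by forcing each column to sum to 1.
    return cols / cols.sum(axis=0, keepdims=True)

# Hypothetical ground-truth parameters: k = 2 components, d = 3 outcomes per view.
w = np.array([0.4, 0.6])                                    # mixing weights
M1 = np.array([[0.6, 0.1], [0.3, 0.2], [0.1, 0.7]])
M2 = np.array([[0.5, 0.2], [0.25, 0.2], [0.25, 0.6]])
M3 = np.array([[0.7, 0.2], [0.2, 0.3], [0.1, 0.5]])
eta = np.array([1.0, 0.0, -1.0])                            # M3^T eta = [0.6, -0.3]

# Population moments (in practice these would be empirical averages over samples).
P12 = M1 @ np.diag(w) @ M2.T
P123_eta = M1 @ np.diag(w * (M3.T @ eta)) @ M2.T

M1_hat = recover_view1_columns(P12, P123_eta, k=2)
print(np.round(M1_hat, 6))  # columns of M1, possibly in a different order
```

The choice of eta matters only in that the entries of M3^T eta must be distinct, so that the eigenvectors of B are uniquely determined. With sample moments instead of population moments, the recovered columns would be approximate, and a random eta is a common way to obtain well-separated eigenvalues.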