On Learning Parallel Pancakes with Mostly Uniform Weights
This addresses the computational hardness of learning structured Gaussian mixtures for theoretical machine learning, with incremental contributions to understanding weight distribution effects.
The paper tackled the complexity of learning Gaussian mixture models with structural assumptions, showing that a quasi-polynomial upper bound for mixtures with uniform weights is essentially optimal via an SQ lower bound, and provided a quasi-polynomial upper bound for testing when most weights are uniform.
We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb{R}^d$. This task is known to have complexity $d^{Ω(k)}$ in full generality. To circumvent this exponential lower bound on the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponentially small and that the components have the same unknown covariance. Recent work gave a $d^{O(\log(1/w_{\min}))}$-time algorithm for this class of GMMs, where $w_{\min}$ is the minimum weight. Our first main result is a Statistical Query (SQ) lower bound showing that this quasi-polynomial upper bound is essentially best possible, even for the special case of uniform weights. Specifically, we show that it is SQ-hard to distinguish between such a mixture and the standard Gaussian. We further explore how the distribution of weights affects the complexity of this task. Our second main result is a quasi-polynomial upper bound for the aforementioned testing task when most of the weights are uniform while a small fraction of the weights are potentially arbitrary.