DSLGSTMLDec 3, 2020

Robustly Learning Mixtures of $k$ Arbitrary Gaussians

arXiv:2012.02119v321 citations
AI Analysis

This work provides a robust estimation algorithm for mixtures of arbitrary Gaussians, which is a fundamental problem in machine learning and statistics, benefiting researchers and practitioners dealing with noisy or adversarial data.

This paper presents a polynomial-time algorithm for robustly estimating a mixture of k arbitrary Gaussians in d dimensions, even with a constant fraction of arbitrary corruptions. This resolves a significant open problem in algorithmic robust statistics.

We give a polynomial-time algorithm for the problem of robustly estimating a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of two Gaussians. Our main tools are an efficient \emph{partial clustering} algorithm that relies on the sum-of-squares method, and a novel \emph{tensor decomposition} algorithm that allows errors in both Frobenius norm and low-rank terms.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes