LG, DS, ST, ML | Mar 15, 2024

Robust Sparse Estimation for Gaussians with Optimal Error under Huber Contamination

CMU
arXiv:2403.10416v1 | 1 citation | h-index: 48 | ICML
Originality: Highly original
AI Analysis

For statisticians and machine learning practitioners, this paper gives the first efficient estimators for robust sparse estimation that achieve optimal error, improving on the suboptimal error of all prior efficient methods.

The paper tackles robust sparse estimation for Gaussians under Huber contamination. It provides the first sample- and computationally efficient estimators with optimal error guarantees (within constant factors) for mean estimation, PCA, and linear regression. For k-sparse mean estimation, it achieves ℓ₂-error O(ε), whereas prior efficient algorithms incur error Ω(ε√(log(1/ε))).
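To get a feel for the size of the gap between the two error rates, the following is a minimal arithmetic sketch. It ignores all constant factors hidden in the O(·) and Ω(·) notation, and the helper name `error_gap` is a hypothetical one introduced here for illustration.

```python
import math

def error_gap(eps: float) -> float:
    """Ratio of eps*sqrt(log(1/eps)) to eps, i.e. sqrt(log(1/eps)).

    Constants hidden by O(.) and Omega(.) are ignored (an assumption of
    this sketch), so only the growth of the ratio is meaningful.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    return math.sqrt(math.log(1.0 / eps))

# Compare the two rates for a few corruption levels.
for eps in (0.1, 0.01, 0.001):
    prior_rate = eps * error_gap(eps)
    print(f"eps={eps}: eps={eps:.4f}, "
          f"eps*sqrt(log(1/eps))={prior_rate:.4f}, "
          f"ratio={error_gap(eps):.2f}")
```

The ratio grows without bound as ε shrinks, though only slowly, which is why the improvement is described as removing a √(log(1/ε)) factor.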

We study Gaussian sparse estimation tasks in Huber's contamination model with a focus on mean estimation, PCA, and linear regression. For each of these tasks, we give the first sample and computationally efficient robust estimators with optimal error guarantees, within constant factors. All prior efficient algorithms for these tasks incur quantitatively suboptimal error. Concretely, for Gaussian robust $k$-sparse mean estimation on $\mathbb{R}^d$ with corruption rate $\epsilon>0$, our algorithm has sample complexity $(k^2/\epsilon^2)\,\mathrm{polylog}(d/\epsilon)$, runs in sample polynomial time, and approximates the target mean within $\ell_2$-error $O(\epsilon)$. Previous efficient algorithms inherently incur error $\Omega(\epsilon\sqrt{\log(1/\epsilon)})$. At the technical level, we develop a novel multidimensional filtering method in the sparse regime that may find other applications.
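The setting in the abstract can be illustrated with a small simulation. The sketch below draws samples from a Huber-contaminated Gaussian, (1 − ε)·N(μ, I) + ε·Q, with a k-sparse mean μ, and applies a naive baseline: coordinate-wise median followed by keeping the k largest-magnitude coordinates. This baseline is not the paper's multidimensional filtering method and carries none of its guarantees; its error can accumulate across the k nonzero coordinates. The outlier distribution Q (a point mass), the parameter values, and all function names are assumptions made for illustration.

```python
import random
import statistics

def sample_huber(n, mu, eps, outlier, rng):
    """Draw n points from (1 - eps) * N(mu, I) + eps * Q.

    Q is taken to be a point mass at `outlier` (an assumption of this sketch).
    """
    points = []
    for _ in range(n):
        if rng.random() < eps:
            points.append(list(outlier))  # contaminated sample
        else:
            points.append([rng.gauss(m, 1.0) for m in mu])  # clean sample
    return points

def sparse_median_baseline(points, k):
    """Coordinate-wise median, then zero out all but the k largest entries."""
    d = len(points[0])
    med = [statistics.median(p[j] for p in points) for j in range(d)]
    keep = set(sorted(range(d), key=lambda j: abs(med[j]), reverse=True)[:k])
    return [med[j] if j in keep else 0.0 for j in range(d)]

def l2_error(a, b):
    """Euclidean distance between two vectors given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

rng = random.Random(0)                 # fixed seed for reproducibility
d, k, eps, n = 50, 3, 0.1, 2000        # illustrative values only
mu = [5.0] * k + [0.0] * (d - k)       # k-sparse target mean
outlier = [m + 3.0 for m in mu]        # hypothetical adversary: shift every coordinate
data = sample_huber(n, mu, eps, outlier, rng)
est = sparse_median_baseline(data, k)
print("l2 error of naive baseline:", l2_error(est, mu))
```

Varying `eps` and `k` in this sketch gives a rough sense of how a simple coordinate-wise estimator degrades under contamination, which is the behavior the paper's estimators are designed to improve on.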

