DSLGSTMLNov 29, 2022

Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

CMU
arXiv:2211.16333v114 citationsh-index: 48
Originality Highly original
AI Analysis

It addresses a fundamental statistical problem for machine learning and data analysis, providing the first efficient algorithm for this heavy-tailed setting, which is incremental over prior light-tailed work.

The paper tackles outlier-robust mean estimation for high-dimensional heavy-tailed distributions with sparsity, achieving optimal asymptotic error with sample complexity scaling logarithmically in dimension and additively in failure probability.

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $μ$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $μ$ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability $τ$, having an additive $\log(1/τ)$ dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes