MLLGSTApr 24, 2020

Robust subgaussian estimation with VC-dimension

arXiv:2004.11734v315 citations
Originality Incremental advance
AI Analysis

This work provides incremental improvements in robust estimation for statisticians and machine learning practitioners by enabling subgaussian bounds under weaker assumptions.

The paper tackles the problem of robust statistical estimation with heavy-tailed or corrupted data by introducing a new method to bound excess risk for median-of-means estimators using VC-dimension instead of Rademacher complexity. This results in robust estimators for sparse estimation achieving subgaussian rates with only a finite second moment assumption, improving over prior methods that required more moments.

Median-of-means (MOM) based procedures provide non-asymptotic and strong deviation bounds even when data are heavy-tailed and/or corrupted. This work proposes a new general way to bound the excess risk for MOM estimators. The core technique is the use of VC-dimension (instead of Rademacher complexity) to measure the statistical complexity. In particular, this allows to give the first robust estimators for sparse estimation which achieves the so-called subgaussian rate only assuming a finite second moment for the uncorrupted data. By comparison, previous works using Rademacher complexities required a number of finite moments that grows logarithmically with the dimension. With this technique, we derive new robust sugaussian bounds for mean estimation in any norm. We also derive a new robust estimator for covariance estimation that is the first to achieve subgaussian bounds without $L_4-L_2$ norm equivalence.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes