ST DS LG

Jan 21, 2023

Statistically Optimal Robust Mean and Covariance Estimation for Anisotropic Gaussians

ETH Zurich

arXiv:2301.09024v12.311 citations

h-index: 15

Originality: Highly original

AI Analysis

The paper resolves an open problem in robust statistics: it constructs statistically optimal estimators of the mean and covariance of contaminated Gaussian data, with applications in machine learning and data analysis.

The paper studies robust estimation of the mean and covariance of anisotropic Gaussians in the strong contamination model. Its error bounds are dimension-free: they depend on the covariance only through its trace and operator norm, not on the ambient dimension. They are also optimal, up to constant factors, in the contamination level and the sample size.

Assume that $X_{1}, \ldots, X_{N}$ is an $\varepsilon$-contaminated sample of $N$ independent Gaussian vectors in $\mathbb{R}^d$ with mean $\mu$ and covariance $\Sigma$. In the strong $\varepsilon$-contamination model we assume that the adversary replaced an $\varepsilon$ fraction of vectors in the original Gaussian sample by any other vectors. We show that there is an estimator $\widehat{\mu}$ of the mean satisfying, with probability at least $1 - \delta$, a bound of the form
\[ \|\widehat{\mu} - \mu\|_2 \le c\left(\sqrt{\frac{\operatorname{Tr}(\Sigma)}{N}} + \sqrt{\frac{\|\Sigma\|\log(1/\delta)}{N}} + \varepsilon\sqrt{\|\Sigma\|}\right), \]
where $c > 0$ is an absolute constant and $\|\Sigma\|$ denotes the operator norm of $\Sigma$. In the same contaminated Gaussian setup, we construct an estimator $\widehat{\Sigma}$ of the covariance matrix $\Sigma$ that satisfies, with probability at least $1 - \delta$,
\[ \left\|\widehat{\Sigma} - \Sigma\right\| \le c\left(\sqrt{\frac{\|\Sigma\|\operatorname{Tr}(\Sigma)}{N}} + \|\Sigma\|\sqrt{\frac{\log(1/\delta)}{N}} + \varepsilon\|\Sigma\|\right). \]
Both results are optimal up to multiplicative constant factors. Despite the recent significant interest in robust statistics, achieving both dimension-free bounds in the canonical Gaussian case remained open. In fact, several previously known results were either dimension-dependent and required $\Sigma$ to be close to identity, or had a sub-optimal dependence on the contamination level $\varepsilon$. As a part of the analysis, we derive sharp concentration inequalities for central order statistics of Gaussian, folded normal, and chi-squared distributions.
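To make the "dimension-free" claim concrete, the following minimal sketch evaluates the right-hand sides of the two bounds above for a given spectrum of $\Sigma$. It does not implement the paper's estimators. The function names `mean_rate` and `cov_rate` are hypothetical, the absolute constant $c$ is not specified in the abstract and is set to 1 purely for illustration, and $\Sigma$ is represented only by its eigenvalues, so that its trace is their sum and its operator norm is their maximum.

```python
import math


def mean_rate(eigs, n, eps, delta, c=1.0):
    """Right-hand side of the mean bound for a covariance with eigenvalues `eigs`.

    c is an assumed placeholder for the unspecified absolute constant.
    """
    trace = sum(eigs)      # Tr(Sigma)
    op_norm = max(eigs)    # operator norm of a PSD matrix = largest eigenvalue
    return c * (
        math.sqrt(trace / n)
        + math.sqrt(op_norm * math.log(1.0 / delta) / n)
        + eps * math.sqrt(op_norm)
    )


def cov_rate(eigs, n, eps, delta, c=1.0):
    """Right-hand side of the covariance bound (operator norm error)."""
    trace = sum(eigs)
    op_norm = max(eigs)
    return c * (
        math.sqrt(op_norm * trace / n)
        + op_norm * math.sqrt(math.log(1.0 / delta) / n)
        + eps * op_norm
    )


# Two spectra in dimension d = 1000 with the same operator norm:
# one isotropic, one strongly anisotropic (illustrative values only).
iso = [1.0] * 1000
aniso = [1.0] + [1e-3] * 999

for name, eigs in [("isotropic", iso), ("anisotropic", aniso)]:
    m = mean_rate(eigs, n=10_000, eps=0.01, delta=0.01)
    s = cov_rate(eigs, n=10_000, eps=0.01, delta=0.01)
    print(f"{name}: mean rate {m:.4f}, covariance rate {s:.4f}")
```

Both spectra live in the same ambient dimension and share the same operator norm, so the contamination and confidence terms coincide. Only the trace term differs, which is why the anisotropic case yields a much smaller bound even though $d$ is unchanged.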
