Fast Mean Estimation with Sub-Gaussian Rates
The paper provides a more computationally efficient estimator for robust mean estimation in high-dimensional statistics, applicable to heavy-tailed data, though it is an incremental advance over existing polynomial-time estimators.
The paper tackles the problem of estimating the mean of a random vector under minimal assumptions, achieving error bounds that match the sub-Gaussian rate while requiring only finite mean and covariance. The result is an estimator with runtime $O(n^4 + n^2 d)$ for $n$ samples in $\mathbb{R}^d$, which is significantly faster than the earlier sum-of-squares-based polynomial-time estimator while retaining optimal statistical efficiency.
We propose an estimator for the mean of a random vector in $\mathbb{R}^d$ that can be computed in time $O(n^4+n^2d)$ for $n$ i.i.d. samples and that has error bounds matching the sub-Gaussian case. The only assumptions we make about the data distribution are that it has finite mean and covariance; in particular, we make no assumptions about higher-order moments. Like the polynomial-time estimator introduced by Hopkins (2018), which is based on the sum-of-squares hierarchy, our estimator achieves optimal statistical efficiency in this challenging setting, but it has a significantly faster runtime and a simpler analysis.
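For readers unfamiliar with the phrase "error bounds matching the sub-Gaussian case", the guarantee usually meant is sketched below. The notation is an assumption, since none of it appears in the text above: $\hat{\mu}$ is the estimator, $\mu$ and $\Sigma$ are the true mean and covariance, $\delta \in (0,1)$ is the failure probability, and $C$ is an absolute constant. With probability at least $1-\delta$,

\[
\|\hat{\mu} - \mu\|_2 \;\le\; C \left( \sqrt{\frac{\operatorname{Tr}(\Sigma)}{n}} \;+\; \sqrt{\frac{\|\Sigma\|_{\mathrm{op}} \, \log(1/\delta)}{n}} \right).
\]

This is the rate the empirical mean attains for Gaussian data. For heavy-tailed data with only two finite moments, the empirical mean's dependence on $\delta$ can degrade from $\sqrt{\log(1/\delta)}$ to polynomial in $1/\delta$, which is why a different estimator is needed in this setting.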