Multivariate mean estimation with direction-dependent accuracy
This work addresses a foundational problem in multivariate statistics: it provides a theoretical guarantee for mean estimation under minimal assumptions, though the contribution appears incremental in that it builds on existing estimation frameworks.
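The paper tackles the problem of estimating the mean of a random vector with direction-dependent accuracy. It proves the existence of an estimator that achieves near-optimal error in every direction where the variance is not too small. The bound scales as $1/\sqrt{N}$ times the sum of a direction-dependent term $\sigma(u)\sqrt{\log(1/\delta)}$ and a direction-independent term $\left(\mathbb{E}\|X-\mathbb{E}X\|_2^2\right)^{1/2}$, up to a constant $C$.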
We consider the problem of estimating the mean of a random vector based on $N$ independent, identically distributed observations. We prove the existence of an estimator that has a near-optimal error in all directions in which the variance of the one-dimensional marginal of the random vector is not too small: with probability $1-\delta$, the procedure returns $\widehat{\mu}_N$ which satisfies that for every direction $u \in S^{d-1}$, \[ \langle \widehat{\mu}_N - \mu, u \rangle \le \frac{C}{\sqrt{N}} \left( \sigma(u)\sqrt{\log(1/\delta)} + \left(\mathbb{E}\|X-\mathbb{E} X\|_2^2\right)^{1/2} \right), \] where $\sigma^2(u) = \mathrm{Var}(\langle X,u \rangle)$ and $C$ is a constant. To achieve this, we require only slightly more than the existence of the covariance matrix, in the form of a certain moment-equivalence assumption. The proof relies on novel bounds for the ratio of empirical and true probabilities that hold uniformly over certain classes of random variables.
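
To make the shape of the bound concrete, the following minimal sketch evaluates its right-hand side for a given covariance matrix and direction. It does not implement the estimator, whose construction is not described here. It relies only on the identities $\sigma^2(u) = u^\top \Sigma u$ and $\mathbb{E}\|X-\mathbb{E}X\|_2^2 = \mathrm{Tr}(\Sigma)$, where $\Sigma$ is the covariance matrix of $X$. The function name `directional_bound` and the default `C=1.0` are illustrative assumptions: the abstract does not specify the value of $C$.

```python
import math

def directional_bound(cov, u, n, delta, C=1.0):
    """Evaluate (C / sqrt(n)) * (sigma(u) * sqrt(log(1/delta)) + sqrt(trace(cov))).

    cov   : covariance matrix as a list of lists (d x d)
    u     : direction vector (normalised internally to unit length)
    n     : sample size N
    delta : failure probability, 0 < delta < 1
    C     : placeholder for the unspecified constant (assumption, not from the source)
    """
    d = len(cov)
    norm = math.sqrt(sum(x * x for x in u))
    u = [x / norm for x in u]  # project u onto the unit sphere S^{d-1}

    # sigma^2(u) = Var(<X, u>) = u^T cov u
    sigma2 = sum(u[i] * cov[i][j] * u[j] for i in range(d) for j in range(d))

    # E||X - EX||_2^2 = trace(cov)
    trace = sum(cov[i][i] for i in range(d))

    return C / math.sqrt(n) * (math.sqrt(sigma2 * math.log(1.0 / delta)) + math.sqrt(trace))


# Example: variance 4 along the first axis, 1 along the second.
cov = [[4.0, 0.0], [0.0, 1.0]]
high_var = directional_bound(cov, [1.0, 0.0], n=100, delta=math.exp(-4))
low_var = directional_bound(cov, [0.0, 1.0], n=100, delta=math.exp(-4))
```

In this example the direction-dependent term differs between the two axes while the trace term is shared. This illustrates why the guarantee is informative mainly in directions where $\sigma(u)$ is not too small relative to the trace term.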