DS LG ST MLJan 9, 2025

Entangled Mean Estimation in High-Dimensions

Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, Thanasis Pittas

arXiv:2501.05425v13.33 citationsh-index: 48STOC

Originality Incremental advance

AI Analysis

This work addresses a foundational statistical estimation problem in high-dimensional settings, with potential applications in robust statistics and machine learning, though it appears incremental by extending one-dimensional results to multivariate cases.

The paper tackles the problem of high-dimensional entangled mean estimation in the subset-of-signals model, where an unknown fraction of points have identity-bounded covariances, and it designs a computationally efficient algorithm achieving a near-optimal error rate of f(α,N) + √(D/(αN)) up to polylogarithmic factors.

We study the task of high-dimensional entangled mean estimation in the subset-of-signals model. Specifically, given $N$ independent random points $x_1,\ldots,x_N$ in $\mathbb{R}^D$ and a parameter $α\in (0, 1)$ such that each $x_i$ is drawn from a Gaussian with mean $μ$ and unknown covariance, and an unknown $α$-fraction of the points have identity-bounded covariances, the goal is to estimate the common mean $μ$. The one-dimensional version of this task has received significant attention in theoretical computer science and statistics over the past decades. Recent work [LY20; CV24] has given near-optimal upper and lower bounds for the one-dimensional setting. On the other hand, our understanding of even the information-theoretic aspects of the multivariate setting has remained limited. In this work, we design a computationally efficient algorithm achieving an information-theoretically near-optimal error. Specifically, we show that the optimal error (up to polylogarithmic factors) is $f(α,N) + \sqrt{D/(αN)}$, where the term $f(α,N)$ is the error of the one-dimensional problem and the second term is the sub-Gaussian error rate. Our algorithmic approach employs an iterative refinement strategy, whereby we progressively learn more accurate approximations $\hat μ$ to $μ$. This is achieved via a novel rejection sampling procedure that removes points significantly deviating from $\hat μ$, as an attempt to filter out unusually noisy samples. A complication that arises is that rejection sampling introduces bias in the distribution of the remaining points. To address this issue, we perform a careful analysis of the bias, develop an iterative dimension-reduction strategy, and employ a novel subroutine inspired by list-decodable learning that leverages the one-dimensional result.

View on arXiv PDF

Similar