DS LG MLMay 23, 2024

Efficient Certificates of Anti-Concentration Beyond Gaussians

Ainesh Bakshi, Pravesh Kothari, Goutham Rajendran, Madhur Tulsiani, Aravindan Vijayaraghavan

arXiv:2405.15084v23.33 citationsh-index: 21FOCS

Originality Incremental advance

AI Analysis

This work addresses a limitation in algorithmic robust statistics by enabling anti-concentration certificates for a broader class of distributions, which is incremental but extends existing results to non-rotationally invariant settings.

The paper tackles the problem of constructing efficient certificates of anti-concentration for non-Gaussian distributions, such as bounded product distributions and uniform distributions over L_p balls, using a new formulation and sum-of-squares methods, resulting in quasi-polynomial time verifiable certificates that extend applications in list-decodable learning and clustering beyond Gaussian distributions.

A set of high dimensional points $X=\{x_1, x_2,\ldots, x_n\} \subset R^d$ in isotropic position is said to be $δ$-anti concentrated if for every direction $v$, the fraction of points in $X$ satisfying $|\langle x_i,v \rangle |\leq δ$ is at most $O(δ)$. Motivated by applications to list-decodable learning and clustering, recent works have considered the problem of constructing efficient certificates of anti-concentration in the average case, when the set of points $X$ corresponds to samples from a Gaussian distribution. Their certificates played a crucial role in several subsequent works in algorithmic robust statistics on list-decodable learning and settling the robust learnability of arbitrary Gaussian mixtures, yet remain limited to rotationally invariant distributions. This work presents a new (and arguably the most natural) formulation for anti-concentration. Using this formulation, we give quasi-polynomial time verifiable sum-of-squares certificates of anti-concentration that hold for a wide class of non-Gaussian distributions including anti-concentrated bounded product distributions and uniform distributions over $L_p$ balls (and their affine transformations). Consequently, our method upgrades and extends results in algorithmic robust statistics e.g., list-decodable learning and clustering, to such distributions. Our approach constructs a canonical integer program for anti-concentration and analysis a sum-of-squares relaxation of it, independent of the intended application. We rely on duality and analyze a pseudo-expectation on large subsets of the input points that take a small value in some direction. Our analysis uses the method of polynomial reweightings to reduce the problem to analyzing only analytically dense or sparse directions.

View on arXiv PDF

Similar