LG · DS · ST · ML · Oct 18, 2023

SQ Lower Bounds for Learning Mixtures of Linear Classifiers

arXiv:2310.11876v1 · 4 citations · h-index: 48
Originality: Incremental advance
AI Analysis

The paper establishes theoretical limits for a fundamental problem in machine learning. It is an incremental advance: its lower bounds indicate that existing algorithms are close to optimal.

The paper tackles the problem of learning mixtures of linear classifiers under Gaussian covariates. It proves a Statistical Query (SQ) lower bound of n^(poly(1/Δ) log(r)) on the complexity of any SQ algorithm, where r is the number of mixture components and Δ is a lower bound on the pairwise ℓ2-separation between the unknown unit vectors. This suggests that known algorithms are essentially optimal.

We study the problem of learning mixtures of linear classifiers under Gaussian covariates. Given sample access to a mixture of $r$ distributions on $\mathbb{R}^n$ of the form $(\mathbf{x},y_{\ell})$, $\ell\in [r]$, where $\mathbf{x}\sim\mathcal{N}(0,\mathbf{I}_n)$ and $y_\ell=\mathrm{sign}(\langle\mathbf{v}_\ell,\mathbf{x}\rangle)$ for an unknown unit vector $\mathbf{v}_\ell$, the goal is to learn the underlying distribution in total variation distance. Our main result is a Statistical Query (SQ) lower bound suggesting that known algorithms for this problem are essentially best possible, even for the special case of uniform mixtures. In particular, we show that the complexity of any SQ algorithm for the problem is $n^{\mathrm{poly}(1/\Delta) \log(r)}$, where $\Delta$ is a lower bound on the pairwise $\ell_2$-separation between the $\mathbf{v}_\ell$'s. The key technical ingredient underlying our result is a new construction of spherical designs that may be of independent interest.
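
To make the data model in the abstract concrete, here is a minimal sketch of how samples from a uniform mixture of linear classifiers could be generated, and how the separation parameter $\Delta$ could be computed for a given set of directions. It only illustrates the problem setup, not any algorithm or construction from the paper. The function names `sample_mixture` and `min_pairwise_separation` are hypothetical, and the convention of labeling a zero inner product as $+1$ is an assumption (that event has probability zero under Gaussian covariates).

```python
import math
import random


def sample_mixture(directions, m, seed=None):
    """Draw m examples from a uniform mixture of linear classifiers.

    directions: list of r unit vectors v_1..v_r in R^n (assumed unit norm).
    Returns a list of (x, y, ell) triples. A learner would observe only
    (x, y); the component index ell is returned purely for inspection.
    """
    rng = random.Random(seed)
    n = len(directions[0])
    samples = []
    for _ in range(m):
        # Choose a mixture component uniformly at random.
        ell = rng.randrange(len(directions))
        # Covariates: x ~ N(0, I_n), i.e. independent standard Gaussians.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Label: y = sign(<v_ell, x>); a zero inner product is mapped
        # to +1 here (an assumed tie-breaking convention).
        inner = sum(v_i * x_i for v_i, x_i in zip(directions[ell], x))
        y = 1 if inner >= 0 else -1
        samples.append((x, y, ell))
    return samples


def min_pairwise_separation(directions):
    """Smallest pairwise l2 distance between directions (needs r >= 2).

    Any Delta at most this value is a valid lower bound on the separation.
    """
    return min(
        math.dist(u, v)
        for i, u in enumerate(directions)
        for v in directions[i + 1:]
    )
```

For example, with the two orthogonal directions `[[1.0, 0.0], [0.0, 1.0]]` in $\mathbb{R}^2$, `min_pairwise_separation` returns $\sqrt{2}$, and each sampled label is the sign of the coordinate of $\mathbf{x}$ selected by the chosen component.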

