LGDSMLJun 29, 2020

Statistical-Query Lower Bounds via Functional Gradients

arXiv:2006.15812v270 citations
AI Analysis

This work addresses a foundational challenge in machine learning theory by ruling out general statistical-query algorithms for real-valued learning problems, which is unusual and impacts theoretical understanding of neural network learnability.

The paper tackles the problem of establishing statistical-query lower bounds for agnostically learning non-polynomial activations like ReLU and sigmoid under Gaussian marginals, showing that any algorithm requires at least 2^(n^c) ε queries for ReLU regression with tolerance n^{-(1/ε)^b}, where n is dimension and ε is accuracy.

We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific problem of ReLU regression (equivalently, agnostically learning a ReLU), we show that any statistical-query algorithm with tolerance $n^{-(1/ε)^b}$ must use at least $2^{n^c} ε$ queries for some constant $b, c > 0$, where $n$ is the dimension and $ε$ is the accuracy parameter. Our results rule out general (as opposed to correlational) SQ learning algorithms, which is unusual for real-valued learning problems. Our techniques involve a gradient boosting procedure for "amplifying" recent lower bounds due to Diakonikolas et al. (COLT 2020) and Goel et al. (ICML 2020) on the SQ dimension of functions computed by two-layer neural networks. The crucial new ingredient is the use of a nonstandard convex functional during the boosting procedure. This also yields a best-possible reduction between two commonly studied models of learning: agnostic learning and probabilistic concepts.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes