LGDSSTMLFeb 8, 2021

The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals

arXiv:2102.04401v163 citations
Originality Highly original
AI Analysis

This work provides foundational insights into the theoretical limits of agnostic learning for machine learning researchers, particularly those working on statistical query (SQ) algorithms and complexity theory.

This paper investigates agnostic learning under Gaussian distributions, demonstrating that $L^1$-regression is nearly optimal for Boolean concept classes. This finding links computational difficulty to the polynomial degree needed for $L^1$-norm approximation, leading to optimal SQ lower bounds for linear threshold functions and the first non-trivial SQ lower bounds for polynomial threshold functions and intersections of halfspaces.

We study the problem of agnostic learning under the Gaussian distribution. We develop a method for finding hard families of examples for a wide class of problems by using LP duality. For Boolean-valued concept classes, we show that the $L^1$-regression algorithm is essentially best possible, and therefore that the computational difficulty of agnostically learning a concept class is closely related to the polynomial degree required to approximate any function from the class in $L^1$-norm. Using this characterization along with additional analytic tools, we obtain optimal SQ lower bounds for agnostically learning linear threshold functions and the first non-trivial SQ lower bounds for polynomial threshold functions and intersections of halfspaces. We also develop an analogous theory for agnostically learning real-valued functions, and as an application prove near-optimal SQ lower bounds for agnostically learning ReLUs and sigmoids.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes