LG DS ST MLJun 29, 2020

Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and ReLUs under Gaussian Marginals

Ilias Diakonikolas, Daniel M. Kane, Nikos Zarifis

arXiv:2006.16200v120.477 citations

Originality Highly original

AI Analysis

This work addresses fundamental challenges in machine learning theory for researchers, providing evidence that current algorithms for these tasks may be as efficient as possible under the Statistical Query model.

The paper tackles the problem of agnostically learning halfspaces and ReLUs under Gaussian marginals, proving Statistical Query lower bounds of d^poly(1/ε) for both tasks, which suggests that existing upper bounds are nearly optimal.

We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals. In the former problem, given labeled examples $(\mathbf{x}, y)$ from an unknown distribution on $\mathbb{R}^d \times \{ \pm 1\}$, whose marginal distribution on $\mathbf{x}$ is the standard Gaussian and the labels $y$ can be arbitrary, the goal is to output a hypothesis with 0-1 loss $\mathrm{OPT}+ε$, where $\mathrm{OPT}$ is the 0-1 loss of the best-fitting halfspace. In the latter problem, given labeled examples $(\mathbf{x}, y)$ from an unknown distribution on $\mathbb{R}^d \times \mathbb{R}$, whose marginal distribution on $\mathbf{x}$ is the standard Gaussian and the labels $y$ can be arbitrary, the goal is to output a hypothesis with square loss $\mathrm{OPT}+ε$, where $\mathrm{OPT}$ is the square loss of the best-fitting ReLU. We prove Statistical Query (SQ) lower bounds of $d^{\mathrm{poly}(1/ε)}$ for both of these problems. Our SQ lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.

View on arXiv PDF

Similar