LG DS ST MLJan 16, 2025

A Near-optimal Algorithm for Learning Margin Halfspaces with Massart Noise

arXiv:2501.09691v111.47 citationsh-index: 14NIPS

Originality Incremental advance

AI Analysis

This addresses a fundamental challenge in machine learning theory for researchers, providing near-optimal sample complexity in a noisy setting, though it is incremental relative to prior work.

The paper tackles the problem of PAC learning margin halfspaces with Massart noise, achieving a computationally efficient algorithm with sample complexity nearly matching a conjectured lower bound, specifically Φ(1/(γ^2 ε^2)).

We study the problem of PAC learning $γ$-margin halfspaces in the presence of Massart noise. Without computational considerations, the sample complexity of this learning problem is known to be $\widetildeΘ(1/(γ^2 ε))$. Prior computationally efficient algorithms for the problem incur sample complexity $\tilde{O}(1/(γ^4 ε^3))$ and achieve 0-1 error of $η+ε$, where $η<1/2$ is the upper bound on the noise rate. Recent work gave evidence of an information-computation tradeoff, suggesting that a quadratic dependence on $1/ε$ is required for computationally efficient algorithms. Our main result is a computationally efficient learner with sample complexity $\widetildeΘ(1/(γ^2 ε^2))$, nearly matching this lower bound. In addition, our algorithm is simple and practical, relying on online SGD on a carefully selected sequence of convex losses.

View on arXiv PDF

Similar