LG · ML · Feb 10, 2021

Improved Algorithms for Efficient Active Learning Halfspaces with Massart and Tsybakov noise

arXiv:2102.05312v2 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient active learning for noisy data, offering incremental improvements in label complexity for specific noise models.

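The paper tackles computationally efficient active learning of $d$-dimensional homogeneous halfspaces under Massart and Tsybakov noise. For $η$-Massart noise it achieves a near-optimal label complexity of $\tilde{O}\left( \frac{d}{(1-2η)^2} \mathrm{polylog}(\frac1ε) \right)$, and for two subfamilies of Tsybakov noise conditions it gives label complexity guarantees strictly lower than those of passive learning algorithms.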

We give a computationally-efficient PAC active learning algorithm for $d$-dimensional homogeneous halfspaces that can tolerate Massart noise (Massart and Nédélec, 2006) and Tsybakov noise (Tsybakov, 2004). Specialized to the $η$-Massart noise setting, our algorithm achieves an information-theoretically near-optimal label complexity of $\tilde{O}\left( \frac{d}{(1-2η)^2} \mathrm{polylog}(\frac1ε) \right)$ under a wide range of unlabeled data distributions (specifically, the family of "structured distributions" defined in Diakonikolas et al. (2020)). Under the more challenging Tsybakov noise condition, we identify two subfamilies of noise conditions, under which our efficient algorithm provides label complexity guarantees strictly lower than passive learning algorithms.
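To make the $η$-Massart setting concrete, the following is a minimal sketch of how labels for a homogeneous halfspace could be simulated under that noise model. It is not the paper's algorithm. The Gaussian choice for the unlabeled distribution, the particular flip-rate function, and the names `sample_massart_example` and `label_complexity_scale` are all assumptions made for illustration; the only property taken from the noise model is that each label is flipped with a point-dependent probability that never exceeds $η < 1/2$.

```python
import math
import random


def sample_massart_example(w_star, eta, rng):
    """Draw one (x, y, clean) triple from a halfspace with Massart noise.

    x is standard Gaussian (an illustrative assumption, not the paper's
    general family of distributions). The clean label is sign(<w_star, x>),
    and it is flipped with a point-dependent probability eta_x <= eta < 1/2.
    """
    if not 0.0 <= eta < 0.5:
        raise ValueError("Massart noise requires 0 <= eta < 1/2")
    x = [rng.gauss(0.0, 1.0) for _ in w_star]
    margin = sum(wi * xi for wi, xi in zip(w_star, x))
    clean = 1 if margin >= 0 else -1
    # Hypothetical flip rate: any function of x bounded by eta is allowed.
    # Here points closer to the decision boundary are noisier.
    eta_x = eta * math.exp(-abs(margin))
    y = -clean if rng.random() < eta_x else clean
    return x, y, clean


def label_complexity_scale(d, eta):
    """Leading d / (1 - 2*eta)^2 factor of the stated bound.

    The polylog(1/epsilon) term and constants are deliberately omitted.
    """
    return d / (1.0 - 2.0 * eta) ** 2
```

The second helper only evaluates the leading factor of the bound, which shows how quickly the label requirement grows as $η$ approaches $1/2$: doubling the noise margin $1-2η$ cuts this factor by four.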
