Revisiting Agnostic Boosting
This work addresses a gap in statistical learning theory: the statistical properties of boosting are well understood in the realizable case but much less so in the agnostic setting, where the label distribution is arbitrary.
The paper studies agnostic boosting, where no assumptions are made on the distribution of the labels. It proposes a new algorithm with substantially improved sample complexity compared to prior work, together with a nearly matching lower bound that settles the sample complexity up to logarithmic factors.
Boosting is a key method in statistical learning, allowing weak learners to be converted into strong ones. While well studied in the realizable case, the statistical properties of weak-to-strong learning remain less understood in the agnostic setting, where there are no assumptions on the distribution of the labels. In this work, we propose a new agnostic boosting algorithm with substantially improved sample complexity compared to prior works under very general assumptions. Our approach is based on a reduction to the realizable case, followed by a margin-based filtering of high-quality hypotheses. Furthermore, we show a nearly matching lower bound, settling the sample complexity of agnostic boosting up to logarithmic factors.
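
To make the phrase "margin-based filtering" concrete, the following is a minimal sketch of the general idea only. The abstract does not specify the actual procedure, so everything here is an assumption for illustration: candidate hypotheses are taken to be weighted majority votes over {-1, +1}-valued base hypotheses, each candidate is scored by its empirical margin loss on held-out labeled examples, and the candidate with the smallest loss is kept. The names normalized_margin, margin_loss, filter_by_margin and the threshold theta are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def normalized_margin(votes: Sequence[int], weights: Sequence[float], label: int) -> float:
    """Margin y * f(x) of a weighted vote f, normalized so it lies in [-1, 1].

    votes   : predictions in {-1, +1} of each base hypothesis on one example
    weights : voting weight of each base hypothesis (illustrative assumption)
    label   : true label in {-1, +1}
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        raise ValueError("at least one voting weight must be nonzero")
    score = sum(w * v for w, v in zip(weights, votes)) / total
    return label * score


def margin_loss(vote_rows: Sequence[Sequence[int]], weights: Sequence[float],
                labels: Sequence[int], theta: float) -> float:
    """Fraction of examples whose normalized margin is at most theta."""
    low = sum(1 for votes, y in zip(vote_rows, labels)
              if normalized_margin(votes, weights, y) <= theta)
    return low / len(labels)


def filter_by_margin(candidates: Sequence[Sequence[float]],
                     vote_rows: Sequence[Sequence[int]],
                     labels: Sequence[int], theta: float) -> Tuple[int, List[float]]:
    """Return the index of the candidate with the smallest empirical margin loss,
    along with the losses of all candidates."""
    losses = [margin_loss(vote_rows, w, labels, theta) for w in candidates]
    best = min(range(len(losses)), key=losses.__getitem__)
    return best, losses


# Toy held-out data: 4 examples, 3 base hypotheses (made-up values).
vote_rows = [[1, 1, -1], [1, -1, -1], [-1, -1, 1], [1, 1, 1]]
labels = [1, 1, -1, 1]
# Two candidate voting classifiers: a uniform vote, and the first hypothesis alone.
candidates = [[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]]
best, losses = filter_by_margin(candidates, vote_rows, labels, theta=0.5)
```

In this toy data the first base hypothesis is correct on every example, so the second candidate has margin 1 everywhere and is selected. The uniform vote has small or negative margins on three of the four examples and incurs a higher margin loss.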