A Theory of Universal Agnostic Learning
This provides a foundational framework for understanding learning limits in agnostic scenarios, impacting theoretical machine learning.
The paper extends the theory of optimal universal rates for binary classification from the realizable to the agnostic setting, identifying a tetrachotomy of rates (e^{-n}, e^{-o(n)}, o(n^{-1/2}), or arbitrarily slow) and linking them to combinatorial structures of concept classes.
We provide a complete theory of optimal universal rates for binary classification in the agnostic setting. This extends the realizable-case theory of Bousquet, Hanneke, Moran, van Handel, and Yehudayoff (2021) by removing the realizability assumption on the distribution. We identify a fundamental tetrachotomy of optimal rates: for every concept class, the optimal universal rate of convergence of the excess error rate is one of $e^{-n}$, $e^{-o(n)}$, $o(n^{-1/2})$, or arbitrarily slow. We further identify simple combinatorial structures which determine which of these categories any given concept class falls into.