ST · LG · ML · Feb 23, 2022

Exponential Tail Local Rademacher Complexity Risk Bounds Without the Bernstein Condition

arXiv:2202.11461v1 · 11 citations
Originality: Highly original
AI Analysis

This work addresses a foundational problem in statistical learning theory by extending risk bounds to non-convex and improper estimators, such as those in model selection, which were previously outside the reach of classical localization methods.

The paper addresses a limitation of the classical local Rademacher complexity framework: it relies on the Bernstein condition, which often restricts applicability to convex and proper settings. The authors develop an exponential-tail excess risk bound expressed in terms of offset Rademacher complexities. The bound holds under an estimator-dependent geometric condition (the "offset condition") and yields results at least as sharp as those of the classical theory.

The local Rademacher complexity framework is one of the most successful general-purpose toolboxes for establishing sharp excess risk bounds for statistical estimators based on the framework of empirical risk minimization. Applying this toolbox typically requires using the Bernstein condition, which often restricts applicability to convex and proper settings. Recent years have witnessed several examples of problems where optimal statistical performance is only achievable via non-convex and improper estimators originating from aggregation theory, including the fundamental problem of model selection. These examples are currently outside of the reach of the classical localization theory. In this work, we build upon the recent approach to localization via offset Rademacher complexities, for which a general high-probability theory has yet to be established. Our main result is an exponential-tail excess risk bound expressed in terms of the offset Rademacher complexity that yields results at least as sharp as those obtainable via the classical theory. However, our bound applies under an estimator-dependent geometric condition (the "offset condition") instead of the estimator-independent (but, in general, distribution-dependent) Bernstein condition on which the classical theory relies. Our results apply to improper prediction regimes not directly covered by the classical theory.
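To make the central quantity concrete, here is a minimal sketch of a Monte Carlo estimate of an empirical offset Rademacher complexity for a finite function class. It assumes the commonly used form E_sigma max_f (1/n) sum_i [sigma_i f(x_i) - c f(x_i)^2], which may differ from the paper's definition in constants or normalization. The names `offset_rademacher`, `values`, `c`, and `n_draws` are hypothetical and chosen for illustration only.

```python
import numpy as np


def offset_rademacher(values, c, n_draws=2000, seed=0):
    """Monte Carlo estimate of an empirical offset Rademacher complexity.

    values  : array of shape (m, n); values[j, i] = f_j(x_i) for the j-th
              function of a finite class evaluated at the i-th sample point.
    c       : non-negative offset parameter (c = 0 gives the usual
              empirical Rademacher complexity).
    n_draws : number of random sign vectors used in the estimate.
    """
    values = np.asarray(values, dtype=float)
    m, n = values.shape
    rng = np.random.default_rng(seed)

    # Independent Rademacher signs, one row per Monte Carlo draw.
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))

    # Linear term (1/n) * sum_i sigma_i f_j(x_i), shape (n_draws, m).
    linear = sigma @ values.T / n

    # Quadratic offset c * (1/n) * sum_i f_j(x_i)^2, shape (m,).
    penalty = c * np.mean(values ** 2, axis=1)

    # Maximize over the class for each draw, then average over draws.
    return float(np.mean(np.max(linear - penalty, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical class: 50 random functions plus the zero function.
    vals = np.vstack([np.zeros(100), rng.normal(size=(50, 100))])
    print("c = 0  :", offset_rademacher(vals, c=0.0))
    print("c = 0.5:", offset_rademacher(vals, c=0.5))
```

The negative quadratic term penalizes functions with large empirical norm, so the supremum is effectively localized without an explicit restriction to a small ball. With the same seed, the estimate for c > 0 never exceeds the estimate for c = 0.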
