STLGMLFeb 24, 2021

On the Minimal Error of Empirical Risk Minimization

arXiv:2102.12066v17 citations
Originality Highly original
AI Analysis

This work addresses theoretical limitations of ERM for statisticians and machine learning researchers, providing sharp lower bounds that clarify adaptation to model simplicity, with implications for overparameterized models.

The paper investigates the minimal error of Empirical Risk Minimization (ERM) in regression, showing that in fixed design, error depends on global class complexity, while in random design, ERM adapts to simpler models only if local neighborhoods are nearly as complex as the class.

We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression, both in the random and the fixed design settings. Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data. In the fixed design setting, we show that the error is governed by the global complexity of the entire class. In contrast, in random design, ERM may only adapt to simpler models if the local neighborhoods around the regression function are nearly as complex as the class itself, a somewhat counter-intuitive conclusion. We provide sharp lower bounds for performance of ERM for both Donsker and non-Donsker classes. We also discuss our results through the lens of recent studies on interpolation in overparameterized models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes