ML · LG · ST · Oct 6, 2021

Foolish Crowds Support Benign Overfitting

arXiv:2110.02914v5 · 24 citations
Originality: Incremental advance
AI Analysis

This paper addresses when overfitting is harmful in overparameterized machine learning models. It reveals a counterintuitive 'foolish crowd' effect, and its contribution is an incremental advance over existing theory.

The paper proves a lower bound on excess risk for sparse interpolating procedures in overparameterized linear regression, showing that basis pursuit can converge exponentially slower than OLS even with sparse ground truth.

We prove a lower bound on the excess risk of sparse interpolating procedures for linear regression with Gaussian data in the overparameterized regime. We apply this result to obtain a lower bound for basis pursuit (the minimum $\ell_1$-norm interpolant) that implies that its excess risk can converge at an exponentially slower rate than OLS (the minimum $\ell_2$-norm interpolant), even when the ground truth is sparse. Our analysis exposes the benefit of an effect analogous to the "wisdom of the crowd", except here the harm arising from fitting the $\textit{noise}$ is ameliorated by spreading it among many directions -- the variance reduction arises from a $\textit{foolish}$ crowd.
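The two interpolants compared in the abstract can be computed side by side in a small numerical sketch. The example below is illustrative only and does not reproduce the paper's bounds or rates. It assumes isotropic Gaussian covariates, a 1-sparse ground truth, and arbitrary values for `n`, `p`, and `sigma`. The helper names `min_l2_interpolant`, `basis_pursuit`, and `excess_risk` are hypothetical. Basis pursuit is solved as a linear program by splitting the weights into positive and negative parts, a standard reformulation rather than anything specified in the source.

```python
import numpy as np
from scipy.optimize import linprog


def min_l2_interpolant(X, y):
    """Minimum l2-norm solution of X w = y (what the abstract calls OLS)."""
    return np.linalg.pinv(X) @ y


def basis_pursuit(X, y):
    """Minimum l1-norm solution of X w = y, posed as a linear program.

    Write w = u - v with u, v >= 0 and minimize sum(u) + sum(v)
    subject to X u - X v = y.
    """
    n, p = X.shape
    cost = np.ones(2 * p)
    A_eq = np.hstack([X, -X])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:]


def excess_risk(w_hat, w_star):
    """Excess risk under isotropic covariates (an assumption of this sketch):
    E[(x^T (w_hat - w_star))^2] = ||w_hat - w_star||_2^2 when Cov(x) = I."""
    return float(np.sum((w_hat - w_star) ** 2))


# Illustrative, assumed problem sizes: overparameterized since p >> n.
rng = np.random.default_rng(0)
n, p, sigma = 40, 400, 0.5

w_star = np.zeros(p)
w_star[0] = 1.0                              # sparse ground truth
X = rng.standard_normal((n, p))              # Gaussian data
y = X @ w_star + sigma * rng.standard_normal(n)

w_l2 = min_l2_interpolant(X, y)
w_l1 = basis_pursuit(X, y)

print("excess risk, min-l2 interpolant:", excess_risk(w_l2, w_star))
print("excess risk, basis pursuit:     ", excess_risk(w_l1, w_star))
```

Both estimators fit the noisy labels exactly and differ only in which norm they minimize. A single run at one `(n, p)` says nothing about convergence rates; observing the trend the paper describes would require sweeping `n` and `p` under the paper's assumptions.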

