ST · LG · ML · May 29, 2023

Generalized equivalences between subsampling and ridge regularization

arXiv:2305.18496v2 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses theoretical foundations in machine learning by clarifying relationships between subsampling and regularization, with implications for ensemble methods and risk analysis, though it is incremental in building on prior work.

The paper establishes precise asymptotic equivalences between subsampling and ridge regularization for ensemble ridge estimators: linear and quadratic functionals of these estimators are equivalent along specific paths in the plane of ridge regularization level and subsample aspect ratio. It also resolves an open problem by proving that optimally tuned ridge regression has monotonic prediction risk in the data aspect ratio under general distributions.

We establish precise structural and risk equivalences between subsampling and ridge regularization for ensemble ridge estimators. Specifically, we prove that linear and quadratic functionals of subsample ridge estimators, when fitted with different ridge regularization levels $λ$ and subsample aspect ratios $ψ$, are asymptotically equivalent along specific paths in the $(λ,ψ)$-plane (where $ψ$ is the ratio of the feature dimension to the subsample size). Our results only require bounded moment assumptions on feature and response distributions and allow for arbitrary joint distributions. Furthermore, we provide a data-dependent method to determine the equivalent paths of $(λ,ψ)$. An indirect implication of our equivalences is that optimally tuned ridge regression exhibits a monotonic prediction risk in the data aspect ratio. This resolves a recent open problem raised by Nakkiran et al. for general data distributions under proportional asymptotics, assuming a mild regularity condition that maintains regression hardness through linearized signal-to-noise ratios.
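To make the objects in the abstract concrete, the following is a minimal sketch of a subsample ridge ensemble and of one linear and one quadratic functional of it. It is an illustration under stated assumptions, not the paper's procedure: the function names, the simulated Gaussian data, the `1/k` scaling of the Gram matrix, the finite number of ensemble members, and the choice of functionals are all hypothetical. It does not compute the equivalent paths or verify the asymptotic equivalence.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge estimator on a k-by-p design.

    Assumed convention: solve (X'X / k + lam * I) beta = X'y / k.
    """
    k, p = X.shape
    return np.linalg.solve(X.T @ X / k + lam * np.eye(p), X.T @ y / k)


def subsample_ridge_ensemble(X, y, lam, k, n_estimators, rng):
    """Average ridge estimators fitted on random subsamples of size k,
    each drawn without replacement from the n rows of X."""
    n, _ = X.shape
    betas = []
    for _ in range(n_estimators):
        idx = rng.choice(n, size=k, replace=False)
        betas.append(ridge_fit(X[idx], y[idx], lam))
    return np.mean(betas, axis=0)


# Simulated data (hypothetical; any data with bounded moments could be used).
rng = np.random.default_rng(0)
n, p, k = 400, 100, 200
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

lam = 0.1      # ridge regularization level (lambda)
phi = p / n    # data aspect ratio
psi = p / k    # subsample aspect ratio; psi >= phi because k <= n

beta_hat = subsample_ridge_ensemble(X, y, lam, k, n_estimators=50, rng=rng)

# A linear functional a'beta_hat for a fixed unit-norm vector a (illustrative choice).
a = np.ones(p) / np.sqrt(p)
linear_functional = float(a @ beta_hat)

# A quadratic functional: squared estimation error ||beta_hat - beta_true||^2.
quadratic_functional = float((beta_hat - beta_true) @ (beta_hat - beta_true))
```

Varying `lam` and `k` in this sketch traces out different points of the $(λ,ψ)$-plane; the paper's result concerns which of those points yield asymptotically matching functionals.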

