MLLGJun 8, 2022

Boosting the Confidence of Generalization for $L_2$-Stable Randomized Learning Algorithms

arXiv:2206.03834v13 citationsh-index: 40
Originality Highly original
AI Analysis

This work addresses a foundational problem in machine learning theory by bridging the gap between stability regimes, offering improved generalization guarantees for algorithms like SGD.

The paper tackles the tension between uniform stability and weaker distribution-dependent stability notions by establishing an in-expectation generalization error bound for $L_2$-stable randomized algorithms and showing that subbagging yields near-tight exponential generalization bounds, with applications to SGD improving high-probability bounds for convex and non-convex problems.

Exponential generalization bounds with near-tight rates have recently been established for uniformly stable learning algorithms. The notion of uniform stability, however, is stringent in the sense that it is invariant to the data-generating distribution. Under the weaker and distribution dependent notions of stability such as hypothesis stability and $L_2$-stability, the literature suggests that only polynomial generalization bounds are possible in general cases. The present paper addresses this long standing tension between these two regimes of results and makes progress towards relaxing it inside a classic framework of confidence-boosting. To this end, we first establish an in-expectation first moment generalization error bound for potentially randomized learning algorithms with $L_2$-stability, based on which we then show that a properly designed subbagging process leads to near-tight exponential generalization bounds over the randomness of both data and algorithm. We further substantialize these generic results to stochastic gradient descent (SGD) to derive improved high-probability generalization bounds for convex or non-convex optimization problems with natural time decaying learning rates, which have not been possible to prove with the existing hypothesis stability or uniform stability based results.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes