MLITLGSTFeb 7, 2023

A unified recipe for deriving (time-uniform) PAC-Bayes bounds

CMU
arXiv:2302.03421v535 citationsh-index: 45
AI Analysis

This work provides a foundational advancement for machine learning theory by unifying and extending PAC-Bayes bounds to be anytime-valid, which is crucial for applications involving sequential or adaptive data analysis.

The paper tackles the problem of deriving PAC-Bayesian generalization bounds that are time-uniform, meaning they hold at all stopping times rather than just fixed sample sizes, and presents a unified framework that yields such bounds for nonstationary loss functions and non-i.i.d. data, implying time-uniform versions of classical bounds and enabling easier derivation of future bounds.

We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times, not only for a fixed sample size. Our approach combines four tools in the following order: (a) nonnegative supermartingales or reverse submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula (or other convex duality principles), and (d) Ville's inequality. Our main result is a PAC-Bayes theorem which holds for a wide class of discrete stochastic processes. We show how this result implies time-uniform versions of well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester, Maurer, and Catoni, in addition to many recent bounds. We also present several novel bounds. Our framework also enables us to relax traditional assumptions; in particular, we consider nonstationary loss functions and non-i.i.d. data. In sum, we unify the derivation of past bounds and ease the search for future bounds: one may simply check if our supermartingale or submartingale conditions are met and, if so, be guaranteed a (time-uniform) PAC-Bayes bound.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes