Tighter Information-Theoretic Generalization Bounds from Supersamples
This work makes incremental improvements to generalization bounds in machine learning theory, addressing the looseness of existing information-theoretic bounds in the supersample setting.
The authors tackle the problem of deriving tighter information-theoretic generalization bounds for learning algorithms in the supersample setting, and obtain bounds that are shown, theoretically or empirically, to be tighter than all previously known information-theoretic bounds in that setting.
In this work, we present a variety of novel information-theoretic generalization bounds for learning algorithms in the supersample setting of Steinke & Zakynthinou (2020), the setting of the "conditional mutual information" framework. Our development exploits two ideas: projecting the loss pair (obtained from a training instance and a testing instance) down to a single number, and correlating loss values with a Rademacher sequence (and its shifted variants). The presented bounds include square-root bounds, fast-rate bounds (including those based on variance and sharpness), and bounds for interpolating algorithms. We show theoretically or empirically that these bounds are tighter than all information-theoretic bounds known to date in the same supersample setting.
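To make the two ideas above concrete, here is a minimal sketch of how the empirical generalization gap in a supersample setting can be written as a Rademacher-weighted average of per-pair loss differences. It is illustrative only and not the authors' code: the function name `supersample_gap`, the toy mean-estimation learner, the squared loss, and the sign convention `eps = 1 - 2*u` are all assumptions made for this example.

```python
import numpy as np

def supersample_gap(losses, u):
    """Empirical generalization gap on a supersample, computed two ways.

    losses: (n, 2) array; losses[i, j] is the loss of the learned
            hypothesis on instance j of pair i (hypothetical layout).
    u:      (n,) array in {0, 1}; column u[i] of pair i was used for
            training, the other column is held out for testing.
    """
    losses = np.asarray(losses, dtype=float)
    u = np.asarray(u, dtype=int)
    rows = np.arange(len(u))

    # Direct form: average of (test loss - train loss) over the n pairs.
    direct = np.mean(losses[rows, 1 - u] - losses[rows, u])

    # Projected form: collapse each loss pair to one number (the difference)
    # and correlate it with the Rademacher sequence eps_i = (-1)^{u_i}.
    delta = losses[:, 1] - losses[:, 0]
    eps = 1 - 2 * u
    projected = np.mean(eps * delta)
    return direct, projected

# Toy learner (an assumption for illustration): estimate a mean from the
# selected training column and evaluate squared loss on both columns.
rng = np.random.default_rng(0)
n = 50
z = rng.normal(size=(n, 2))          # supersample of n instance pairs
u = rng.integers(0, 2, size=n)       # which column is the training point
w = z[np.arange(n), u].mean()        # learned hypothesis
losses = (z - w) ** 2
direct, projected = supersample_gap(losses, u)
```

The two returned values agree up to floating-point error, which is the identity that lets a bound be stated in terms of the scalar loss difference and the selection variables rather than the full loss pair.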