Information-theoretic generalization bounds for black-box learning algorithms
This work addresses the challenge of providing practical generalization guarantees for black-box learning algorithms, particularly in deep learning. The contribution is incremental in that it builds on existing information-theoretic bounds.
The paper derives generalization bounds for supervised learning algorithms from the information contained in their predictions rather than in the output of the training algorithm. The resulting bounds apply to a wider range of algorithms, remain meaningful for deterministic algorithms, and are easier to estimate. Experiments show that they closely follow the generalization gap in deep learning.
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
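To illustrate why information carried by predictions can be easy to estimate, here is a minimal sketch of a plug-in mutual information estimate between two discrete sequences, for example predicted labels and a binary variable. It is not the estimator used in the paper. The function name `plug_in_mutual_information` and the toy inputs are assumptions made for illustration only.

```python
import math
from collections import Counter


def plug_in_mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("at least one sample is required")

    joint = Counter(zip(xs, ys))  # empirical joint counts
    count_x = Counter(xs)         # empirical marginal counts of X
    count_y = Counter(ys)         # empirical marginal counts of Y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy inputs (assumed): predictions that reveal nothing vs. everything
# about a binary variable.
independent = plug_in_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
identical = plug_in_mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
```

Because predictions in classification take values in a small discrete set, counts like these can be formed from modest sample sizes, whereas the same estimate over high-dimensional weights would generally be impractical. The plug-in estimator is biased for small samples, so this should be read as a sketch rather than a recommended procedure.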