ITLGMLApr 20, 2020

Generalization Error Bounds via $m$th Central Moments of the Information Density

arXiv:2004.09148v27 citations
AI Analysis

This work provides incremental theoretical advances in generalization error bounds for machine learning, particularly relevant for researchers in statistical learning theory.

The authors tackled the problem of bounding generalization error for randomized learning algorithms, presenting a novel bound based on central moments of information density that shows milder dependence on confidence level as higher moments are controlled, and a second bound using binary hypothesis testing that confirms faster tail decay improves confidence dependence.

We present a general approach to deriving bounds on the generalization error of randomized learning algorithms. Our approach can be used to obtain bounds on the average generalization error as well as bounds on its tail probabilities, both for the case in which a new hypothesis is randomly generated every time the algorithm is used - as often assumed in the probably approximately correct (PAC)-Bayesian literature - and in the single-draw case, where the hypothesis is extracted only once. For this last scenario, we present a novel bound that is explicit in the central moments of the information density. The bound reveals that the higher the order of the information density moment that can be controlled, the milder the dependence of the generalization bound on the desired confidence level. Furthermore, we use tools from binary hypothesis testing to derive a second bound, which is explicit in the tail of the information density. This bound confirms that a fast decay of the tail of the information density yields a more favorable dependence of the generalization bound on the confidence level.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes