LG IT ST | Dec 19, 2023

Generalization Analysis of Machine Learning Algorithms via the Worst-Case Data-Generating Probability Measure

Xinying Zou, Samir M. Perlaza, Iñaki Esnaola, Eitan Altman

arXiv:2312.12236v1 | 7.728 citations | h-index: 25 | AAAI

Originality: Incremental advance

AI Analysis

This paper provides a theoretical framework for understanding generalization, but the advance is incremental because it builds on prior work on the Gibbs algorithm.

The paper introduces the worst-case probability measure as a tool to analyze generalization in machine learning, showing that key generalization metrics have closed-form expressions involving this measure and recovering existing results for the Gibbs algorithm.

In this paper, the worst-case probability measure over the data is introduced as a tool for characterizing the generalization capabilities of machine learning algorithms. More specifically, the worst-case probability measure is a Gibbs probability measure and the unique solution to the maximization of the expected loss under a relative entropy constraint with respect to a reference probability measure. Fundamental generalization metrics, such as the sensitivity of the expected loss, the sensitivity of the empirical risk, and the generalization gap are shown to have closed-form expressions involving the worst-case data-generating probability measure. Existing results for the Gibbs algorithm, such as characterizing the generalization gap as a sum of mutual information and lautum information, up to a constant factor, are recovered. A novel parallel is established between the worst-case data-generating probability measure and the Gibbs algorithm. Specifically, the Gibbs probability measure is identified as a fundamental commonality of the model space and the data space for machine learning algorithms.
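The Gibbs form of the worst-case measure can be illustrated on a finite data space. The following is a minimal sketch under stated assumptions, not the paper's construction: the data space has finitely many points, the reference measure is strictly positive, the relative entropy is taken from the candidate measure to the reference, and the constraint is represented by a temperature-like parameter `beta` (a hypothetical name standing in for the Lagrange multiplier of the constraint). Under these assumptions the maximizer is an exponential tilting of the reference measure toward high-loss points. All function names are illustrative.

```python
import math

def worst_case_measure(losses, ref_probs, beta):
    """Exponentially tilt a reference measure toward high-loss points.

    Returns p with p[i] proportional to ref_probs[i] * exp(losses[i] / beta).
    Assumes a finite data space, ref_probs[i] > 0 for all i, and beta > 0.
    """
    # Work in log space and subtract the maximum for numerical stability.
    log_w = [math.log(q) + l / beta for l, q in zip(losses, ref_probs)]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

def relative_entropy(p, q):
    """Relative entropy D(p || q) in nats; terms with p[i] == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def expected_loss(p, losses):
    """Expected loss of a fixed model when data is drawn from p."""
    return sum(pi * l for pi, l in zip(p, losses))

# Hypothetical toy values: per-datapoint losses of one fixed model
# and a reference measure over four data points.
losses = [0.1, 0.5, 2.0, 0.3]
ref = [0.4, 0.3, 0.1, 0.2]

p_star = worst_case_measure(losses, ref, beta=1.0)
radius = relative_entropy(p_star, ref)  # constraint level met by p_star
```

Here `radius` is the relative entropy budget for which `p_star` is the maximizer: any other measure within that budget of `ref` attains an expected loss no larger than `expected_loss(p_star, losses)`. Smaller values of `beta` correspond to a looser constraint and concentrate more mass on the highest-loss points.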
