LGMLFeb 9, 2023

Information Theoretic Lower Bounds for Information Theoretic Upper Bounds

arXiv:2302.04925v219 citationsh-index: 20
Originality Incremental advance
AI Analysis

This work highlights a limitation in current theoretical frameworks for understanding generalization in machine learning, showing that existing bounds are insufficient for algorithms widely used in practice.

The paper tackles the problem of whether information-theoretic generalization bounds can explain the performance of learning algorithms like SGD, finding that for stochastic convex optimization, dimension-dependent mutual information is required, which fails to capture the dimension-independent sample complexity of such algorithms.

We examine the relationship between the mutual information between the output model and the empirical sample and the generalization of the algorithm in the context of stochastic convex optimization. Despite increasing interest in information-theoretic generalization bounds, it is uncertain if these bounds can provide insight into the exceptional performance of various learning algorithms. Our study of stochastic convex optimization reveals that, for true risk minimization, dimension-dependent mutual information is necessary. This indicates that existing information-theoretic generalization bounds fall short in capturing the generalization capabilities of algorithms like SGD and regularized ERM, which have dimension-independent sample complexity.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes