LG, ML | Nov 24, 2022

PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization

arXiv:2211.13609v1 | 86 citations | h-index: 61
Originality: Incremental advance
AI Analysis

This work addresses the challenge of explaining why deep learning generalizes, a question central to machine learning research and practice, though its contribution appears to be an incremental improvement on existing bounds.

The paper tackles the problem of uninformative generalization bounds for deep neural networks. It develops a compression approach based on quantizing parameters in a linear subspace, achieves state-of-the-art generalization bounds on a variety of tasks, including transfer learning, and finds that large models can be compressed far more than previously known.

While there has been progress in developing non-vacuous generalization bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works. In this paper, we develop a compression approach based on quantizing neural network parameters in a linear subspace, profoundly improving on previous results to provide state-of-the-art generalization bounds on a variety of tasks, including transfer learning. We use these tight bounds to better understand the role of model size, equivariance, and the implicit biases of optimization, for generalization in deep learning. Notably, we find large models can be compressed to a much greater extent than previously known, encapsulating Occam's razor. We also argue for data-independent bounds in explaining generalization.
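To make the idea of a compression bound concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the code-length accounting is a simplified entropy-coding estimate, and the bound is the textbook Hoeffding-style Occam bound for a prior of the form 2^(-code length), which is looser than the PAC-Bayes bounds the abstract refers to. It assumes a loss bounded in [0, 1].

```python
import math
import random
from collections import Counter


def subspace_to_weights(theta0, projection, z):
    """Map a low-dimensional vector z to full weights: theta = theta0 + P z.

    `projection` is a list of rows (one per full-space weight); all names
    here are illustrative assumptions, not taken from the paper's code.
    """
    return [t + sum(p * zj for p, zj in zip(row, z))
            for t, row in zip(theta0, projection)]


def quantize(values, levels):
    """Replace each value by the index of its nearest quantization level."""
    return [min(range(len(levels)), key=lambda i: abs(v - levels[i]))
            for v in values]


def code_length_bits(codes, num_levels, bits_per_level=16):
    """Estimate the bits needed to describe the quantized vector.

    Assumed accounting: entropy coding of the symbols (plus 2 bits of
    slack), a codebook stored at `bits_per_level` bits per level, and one
    symbol count per level.
    """
    n = len(codes)
    counts = Counter(codes)
    entropy_bits = -sum(c * math.log2(c / n) for c in counts.values())
    overhead = num_levels * (bits_per_level + math.ceil(math.log2(n + 1)))
    return math.ceil(entropy_bits) + overhead + 2


def occam_bound(train_error, code_bits, n, delta=0.05):
    """Hoeffding-style Occam bound for a hypothesis with a prefix-free code.

    With probability at least 1 - delta over n i.i.d. samples:
    risk <= train_error + sqrt((code_bits * ln 2 + ln(1/delta)) / (2n)).
    """
    complexity = code_bits * math.log(2) + math.log(1.0 / delta)
    return train_error + math.sqrt(complexity / (2 * n))


# Toy usage with synthetic numbers (not results from the paper).
random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(500)]   # subspace parameters
levels = [-1.5, -0.5, 0.0, 0.5, 1.5]               # assumed codebook
codes = quantize(z, levels)
bits = code_length_bits(codes, len(levels))
print(bits, occam_bound(train_error=0.10, code_bits=bits, n=50_000))
```

The sketch shows why compressibility matters: the complexity term grows with the code length of the quantized subspace vector, so a model that can be described in fewer bits gets a tighter guarantee for the same training error.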

Code Implementations: 1 repo
