LG · ML · Sep 6, 2019

Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation

arXiv:1909.03009v2 · 8 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a foundational question in machine learning theory, but its contribution is incremental: it critiques an existing approach without proposing a new solution.

The paper studies how to explain generalization in overparametrized neural networks by evaluating PAC-Bayes bounds optimized with variational inference. It finds that a mean-field Gaussian posterior yields only negligible improvements to non-vacuous bounds.

Explaining how overparametrized neural networks simultaneously achieve low risk and zero empirical risk on benchmark datasets is an open problem. PAC-Bayes bounds optimized using variational inference (VI) have recently been proposed as a promising direction for obtaining non-vacuous bounds. We show empirically that this approach gives negligible gains when modeling the posterior as a Gaussian with diagonal covariance, known as the mean-field approximation. We investigate common explanations, such as the failure of VI due to problems in optimization or choosing a suboptimal prior. Our results suggest that investigating richer posteriors is the most promising direction forward.
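
To make the quantities in the abstract concrete, the following is a minimal sketch of how a PAC-Bayes bound can be evaluated for a mean-field (diagonal-covariance) Gaussian posterior. It assumes a Gaussian prior that is also diagonal and uses a standard McAllester-style bound, which is not necessarily the exact bound used in the paper. The function names `kl_diag_gaussians` and `pac_bayes_bound` are hypothetical and chosen for illustration only.

```python
import math


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(Q || P) between two Gaussians with diagonal covariance.

    Each argument is a sequence with one entry per weight; sigma_* are
    standard deviations. With diagonal covariances the KL divergence
    decomposes into a sum of one-dimensional terms.
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += (
            math.log(sp / sq)
            + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2)
            - 0.5
        )
    return total


def pac_bayes_bound(empirical_risk, kl, n, delta=0.05):
    """McAllester-style upper bound on the risk of the posterior Q.

    Assumed form (holds with probability at least 1 - delta):
        risk <= empirical_risk + sqrt((KL + ln(2 * sqrt(n) / delta)) / (2 * n))
    A value of 1 or more is vacuous for a 0-1 loss.
    """
    complexity = (kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n)
    return empirical_risk + math.sqrt(complexity)


if __name__ == "__main__":
    # Toy posterior and prior over three weights (illustrative values only).
    kl = kl_diag_gaussians(
        mu_q=[0.3, -0.1, 0.8], sigma_q=[0.5, 0.5, 0.5],
        mu_p=[0.0, 0.0, 0.0], sigma_p=[1.0, 1.0, 1.0],
    )
    print(pac_bayes_bound(empirical_risk=0.02, kl=kl, n=50000))
```

In this form the bound trades the empirical risk of the posterior against its KL divergence from the prior. Variational inference optimizes the posterior's means and variances to balance the two, and the abstract's finding is that doing so under the mean-field restriction improves the bound only negligibly.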

