LG · ML · Jun 22, 2020

Differentiable PAC-Bayes Objectives with Partially Aggregated Neural Networks

arXiv:2006.12228v1 · 36 citations
Originality: Incremental advance
AI Analysis

This work addresses training difficulties for stochastic neural networks, offering incremental improvements in gradient estimation and bound tightness for researchers in PAC-Bayesian learning.

The paper tackles the challenge of training stochastic neural networks in a PAC-Bayesian setting by introducing partially-aggregated estimators. These give provably lower-variance gradient estimates for non-differentiable signed-output networks and support a directly optimizable, differentiable objective whose generalization guarantee is twice as tight as that of Letarte et al. (2019) on a similar network type.

We make three related contributions motivated by the challenge of training stochastic neural networks, particularly in a PAC-Bayesian setting: (1) we show how averaging over an ensemble of stochastic neural networks enables a new class of partially-aggregated estimators; (2) we show that these lead to provably lower-variance gradient estimates for non-differentiable signed-output networks; (3) we reformulate a PAC-Bayesian bound for these networks to derive a directly optimisable, differentiable objective and a generalisation guarantee, without using a surrogate loss or loosening the bound. This bound is twice as tight as that of Letarte et al. (2019) on a similar network type. We show empirically that these innovations make training easier and lead to competitive guarantees.
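
To make the aggregation idea in contribution (1) concrete, here is a minimal sketch under assumptions the abstract does not state: a final layer whose weights are drawn from an isotropic, unit-variance Gaussian N(mu, I), followed by a sign output. Under that assumption the average of sign(w . h) over the weight distribution has the closed form erf(mu . h / (sqrt(2) ||h||)), which is differentiable in mu even though each sampled network is not. The function names and the toy values below are hypothetical and are not taken from the paper.

```python
import math
import random


def aggregated_sign_output(mu, h):
    """Closed-form E[sign(w . h)] for w ~ N(mu, I) (assumed weight distribution)."""
    dot = sum(m * x for m, x in zip(mu, h))
    norm = math.sqrt(sum(x * x for x in h))
    return math.erf(dot / (math.sqrt(2.0) * norm))


def sampled_sign_output(mu, h, n_samples, rng):
    """Monte Carlo estimate of the same expectation, sampling the weights."""
    total = 0.0
    for _ in range(n_samples):
        # Draw one weight vector and apply the non-differentiable sign output.
        pre_activation = sum(rng.gauss(m, 1.0) * x for m, x in zip(mu, h))
        total += 1.0 if pre_activation > 0 else -1.0
    return total / n_samples


rng = random.Random(0)
mu = [0.5, -0.2, 0.1]   # hypothetical final-layer weight means
h = [1.0, -1.0, 1.0]    # stand-in for sign activations sampled from earlier layers
exact = aggregated_sign_output(mu, h)
estimate = sampled_sign_output(mu, h, 20000, rng)
print(f"aggregated: {exact:.4f}  sampled: {estimate:.4f}")
```

In this reading, the "partial" part is that only the final layer is averaged analytically while `h` would still come from sampling the earlier layers. The closed form removes the final layer's sampling noise, which is one plausible source of the lower-variance gradient estimates the abstract describes.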
