ML LG Jun 17, 2021

Wide stochastic networks: Gaussian limit and PAC-Bayesian training

Eugenio Clerico, George Deligiannidis, Arnaud Doucet

arXiv:2106.09798v313.212 citations Has Code

Originality: Incremental advance

AI Analysis

This work aims to improve generalization in stochastic neural networks. Its contribution is incremental: it extends existing infinite-width limit results to a stochastic architecture.

The paper analyzes and trains over-parameterized stochastic neural networks. It establishes a Gaussian limit for infinite-width networks that holds both before and during training, and uses the resulting explicit output distribution to build a PAC-Bayesian training procedure that directly optimizes a generalization bound. Experiments on MNIST with a large but finite-width network suggest that this approach can outperform standard PAC-Bayesian methods, though specific numerical gains are not detailed.
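The Gaussian behaviour mentioned above can be illustrated with a small NumPy sketch. It does not reproduce the paper's architecture or its training dynamics. It assumes a generic two-layer network with standard Gaussian weights, a tanh activation, and 1/sqrt(width) output scaling (all choices made for this example only), and compares the excess kurtosis of the output at a fixed input for a narrow and a wide network. An excess kurtosis near zero is consistent with an approximately Gaussian output.

```python
import numpy as np

def sample_outputs(width, x, n_draws, rng):
    """Draw n_draws independent random networks; return each scalar output at x."""
    d = x.shape[0]
    # First-layer weights: one (width, d) Gaussian matrix per draw (illustrative choice).
    W = rng.standard_normal((n_draws, width, d))
    # Output weights: one Gaussian vector of length `width` per draw.
    v = rng.standard_normal((n_draws, width))
    hidden = np.tanh(W @ x / np.sqrt(d))  # shape (n_draws, width)
    # 1/sqrt(width) scaling keeps the output variance of order one as width grows.
    return (v * hidden).sum(axis=1) / np.sqrt(width)

def excess_kurtosis(samples):
    """Sample excess kurtosis; it is zero for an exactly Gaussian distribution."""
    centred = samples - samples.mean()
    return float(np.mean(centred**4) / np.mean(centred**2) ** 2 - 3.0)

rng = np.random.default_rng(0)
x = np.ones(8)  # arbitrary fixed input
narrow = excess_kurtosis(sample_outputs(1, x, 4000, rng))
wide = excess_kurtosis(sample_outputs(256, x, 4000, rng))
print(f"excess kurtosis, width 1:   {narrow:.3f}")
print(f"excess kurtosis, width 256: {wide:.3f}")
```

With these settings the width-1 output is visibly heavy-tailed, while the width-256 output should be much closer to Gaussian, up to sampling noise.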

The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.
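As a rough illustration of what optimising a PAC-Bayesian generalisation bound involves, the sketch below evaluates a standard McAllester-type bound (in the form tightened by Maurer) for diagonal Gaussian prior and posterior distributions. This is a generic textbook bound, not necessarily the objective used in the paper. The function names and the numbers in the usage example are hypothetical.

```python
import numpy as np

def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(Q || P) between diagonal Gaussians Q = N(mu_q, sigma_q^2), P = N(mu_p, sigma_p^2)."""
    mu_q, sigma_q, mu_p, sigma_p = map(np.asarray, (mu_q, sigma_q, mu_p, sigma_p))
    per_dim = (
        np.log(sigma_p / sigma_q)
        + (sigma_q**2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p**2)
        - 0.5
    )
    return float(np.sum(per_dim))

def mcallester_bound(emp_risk, kl, n, delta=0.05):
    """Upper bound on the expected risk of Q that holds with probability >= 1 - delta.

    emp_risk: empirical risk of the stochastic predictor, with loss values in [0, 1]
    kl:       KL(Q || P), where the prior P is chosen before seeing the data
    n:        number of training examples
    """
    complexity = (kl + np.log(2.0 * np.sqrt(n) / delta)) / (2.0 * n)
    return float(emp_risk + np.sqrt(complexity))

# Hypothetical example: a 3-parameter posterior against a standard normal prior.
kl = kl_diag_gaussians(
    mu_q=[0.5, -0.5, 0.0], sigma_q=[0.5, 0.5, 0.5],
    mu_p=[0.0, 0.0, 0.0], sigma_p=[1.0, 1.0, 1.0],
)
bound = mcallester_bound(emp_risk=0.02, kl=kl, n=60000)
print(f"KL = {kl:.4f}, bound = {bound:.4f}")
```

A PAC-Bayesian training procedure treats such a bound as the loss: it adjusts the posterior parameters to trade off empirical risk against the KL term, which requires the expected empirical risk under the posterior to be computable or estimable.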
