LG NE ML · Nov 1, 2018

Stochastic Normalizations as Bayesian Learning

arXiv:1811.00639v17.116 citations

Originality: Incremental advance

AI Analysis

This work addresses the generalization problem in deep learning by providing a Bayesian perspective on normalization, offering a method to enhance performance and uncertainty estimation for practitioners.

The paper investigates why Batch Normalization improves generalization in deep networks and attributes the effect to randomness in batch statistics. It applies this Bayesian learning interpretation to other normalization methods, obtaining test performance comparable to Batch Normalization together with better validation losses, which suit uncertainty estimation.

In this work we investigate the reasons why Batch Normalization (BN) improves the generalization performance of deep networks. We argue that one major reason, distinguishing it from data-independent normalization methods, is the randomness of batch statistics. This randomness appears in the parameters rather than in activations and admits an interpretation as a form of practical Bayesian learning. We apply this idea to other (deterministic) normalization techniques that are oblivious to the batch size. We show that their generalization performance can be improved significantly by Bayesian learning of the same form. We obtain test performance comparable to BN and, at the same time, better validation losses suitable for subsequent output uncertainty estimation through an approximate Bayesian posterior.
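The "randomness of batch statistics" mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's code or method: it only shows that when a fixed scalar input is normalized with the mean and variance of a randomly drawn mini-batch, its normalized value fluctuates, and the fluctuation shrinks as the batch grows. The function names, the Gaussian toy data, and the batch sizes are all assumptions chosen for illustration.

```python
import random
import statistics


def batch_normalize(x, batch, eps=1e-5):
    """Normalize scalar x with the mean and variance of `batch` (toy BN, no learned scale/shift)."""
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu=mean)
    return (x - mean) / (var + eps) ** 0.5


def normalized_spread(x, data, batch_size, trials=1000, seed=0):
    """Standard deviation of the normalized value of x over many random mini-batches."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(trials):
        # The sample x shares a mini-batch with randomly drawn companions.
        batch = rng.sample(data, batch_size - 1) + [x]
        outputs.append(batch_normalize(x, batch))
    return statistics.pstdev(outputs)


# Hypothetical toy dataset: 10,000 draws from a standard Gaussian.
data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(10_000)]

x = 0.5  # the fixed input whose normalized value we track
small = normalized_spread(x, data, batch_size=8)
large = normalized_spread(x, data, batch_size=256)

print(f"spread with batch size 8:   {small:.4f}")
print(f"spread with batch size 256: {large:.4f}")
```

In this sketch the same input maps to a different normalized value on every draw, so the noise behaves like a perturbation of the normalization statistics rather than of the input itself. A deterministic normalization would give a spread of zero here, which is the gap the abstract proposes to close by adding randomness of a similar form.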
