LG · ML · Mar 12, 2018

Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches

arXiv:1803.04386v2 · 345 citations
Originality: Highly original
AI Analysis

Flipout addresses a bottleneck in training neural networks with stochastic weights, which are used for regularization, Bayesian neural nets, exploration in reinforcement learning, and evolution strategies.

The paper tackles the problem of correlated weight perturbations in stochastic neural networks: because all examples in a mini-batch typically share one perturbation, larger mini-batches yield less variance reduction than they should. Flipout decorrelates the gradients by implicitly sampling a pseudo-independent perturbation for each example. Empirically, it achieves the ideal linear variance reduction for fully connected networks, convolutional networks, and RNNs; speeds up training with multiplicative Gaussian perturbations; regularizes LSTMs better than previous methods; and makes evolution strategies vectorizable, with a single GPU matching the throughput of at least 40 CPU cores, roughly a factor-of-4 cost reduction on Amazon Web Services.

Stochastic neural net weights are used in a variety of contexts, including regularization, Bayesian neural nets, exploration in reinforcement learning, and evolution strategies. Unfortunately, due to the large number of weights, all the examples in a mini-batch typically share the same weight perturbation, thereby limiting the variance reduction effect of large mini-batches. We introduce flipout, an efficient method for decorrelating the gradients within a mini-batch by implicitly sampling pseudo-independent weight perturbations for each example. Empirically, flipout achieves the ideal linear variance reduction for fully connected networks, convolutional networks, and RNNs. We find significant speedups in training neural networks with multiplicative Gaussian perturbations. We show that flipout is effective at regularizing LSTMs, and outperforms previous methods. Flipout also enables us to vectorize evolution strategies: in our experiments, a single GPU with flipout can handle the same throughput as at least 40 CPU cores using existing methods, equivalent to a factor-of-4 cost reduction on Amazon Web Services.
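
To make "implicitly sampling pseudo-independent weight perturbations" concrete, the following is a minimal sketch of a flipout forward pass for one fully connected layer. It is not the authors' implementation. It assumes the construction commonly given for flipout: a single perturbation ΔW is sampled for the whole batch, each example n draws random sign vectors s_n and r_n, and its effective weights are W_mean + ΔW ∘ (s_n r_nᵀ). This relies on the perturbation distribution being symmetric about zero and independent across weights. All function names are hypothetical, and plain lists are used only to keep the example dependency-free.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def random_signs(rows, cols, rng):
    """Matrix of independent entries drawn uniformly from {-1, +1}."""
    return [[rng.choice((-1.0, 1.0)) for _ in range(cols)] for _ in range(rows)]


def flipout_dense(x, w_mean, delta_w, rng):
    """Flipout forward pass for one fully connected layer (illustrative).

    x: batch x n_in inputs; w_mean: n_in x n_out mean weights;
    delta_w: n_in x n_out perturbation, sampled once and shared by the batch.
    Returns the outputs plus the sign matrices s (batch x n_in) and
    r (batch x n_out) so the result can be checked against a reference.
    """
    batch, n_in, n_out = len(x), len(w_mean), len(w_mean[0])
    s = random_signs(batch, n_in, rng)   # per-example input-side signs
    r = random_signs(batch, n_out, rng)  # per-example output-side signs
    mean_part = matmul(x, w_mean)
    x_signed = [[xi * si for xi, si in zip(x_row, s_row)]
                for x_row, s_row in zip(x, s)]
    pert_part = matmul(x_signed, delta_w)
    # y = x W_mean + ((x * s) delta_W) * r, using only two batched matmuls
    y = [[m + p * ri for m, p, ri in zip(m_row, p_row, r_row)]
         for m_row, p_row, r_row in zip(mean_part, pert_part, r)]
    return y, s, r


def explicit_dense(x, w_mean, delta_w, s, r):
    """Slow reference: build each example's weight matrix explicitly."""
    y = []
    for x_row, s_row, r_row in zip(x, s, r):
        w_n = [[w_mean[i][j] + delta_w[i][j] * s_row[i] * r_row[j]
                for j in range(len(r_row))]
               for i in range(len(s_row))]
        y.append(matmul([x_row], w_n)[0])
    return y


if __name__ == "__main__":
    rng = random.Random(0)
    x = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]
    w_mean = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(3)]
    delta_w = [[rng.gauss(0, 0.1) for _ in range(2)] for _ in range(3)]
    y, s, r = flipout_dense(x, w_mean, delta_w, rng)
    y_ref = explicit_dense(x, w_mean, delta_w, s, r)
    # Largest absolute difference between the two computations
    print(max(abs(a - b) for ya, yb in zip(y, y_ref) for a, b in zip(ya, yb)))
```

The point of the sketch is that `flipout_dense` never materializes a separate weight matrix per example, yet it matches `explicit_dense` up to floating-point error. The per-example perturbations cost two batched matrix products instead of one.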

