ML · LG · Jan 19, 2017

Variational Dropout Sparsifies Deep Neural Networks

arXiv:1701.05369v3 · 889 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for more efficient neural networks by enabling extreme sparsity. It is rated as an incremental advance because it builds on the existing Variational Dropout technique.

The authors sparsify deep neural networks by extending Variational Dropout to unbounded dropout rates. They report reducing the parameter count by up to 280 times on LeNet architectures and up to 68 times on VGG-like networks, with a negligible decrease in accuracy.
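To make the reported reduction factors concrete, the short sketch below converts them into the share of weights that remains. The helper name `remaining_fraction` is hypothetical, and the factors are the best-case ("up to") figures quoted above, not per-layer results.

```python
def remaining_fraction(reduction_factor: float) -> float:
    """Fraction of parameters left after an N-fold reduction (hypothetical helper)."""
    return 1.0 / reduction_factor


# Best-case reduction factors quoted in the summary above.
for name, factor in [("LeNet", 280), ("VGG-like", 68)]:
    kept = 100 * remaining_fraction(factor)
    print(f"{name}: about {kept:.2f}% of the weights remain")
```

A 280-fold reduction therefore keeps well under one percent of the original parameters, which is why the result is described as extreme sparsity.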

We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation of Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report first experimental results with individual dropout rates per weight. Interestingly, it leads to extremely sparse solutions in both fully-connected and convolutional layers. This effect is similar to the automatic relevance determination effect in empirical Bayes but has a number of advantages. We reduce the number of parameters up to 280 times on LeNet architectures and up to 68 times on VGG-like networks with a negligible decrease in accuracy.
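The abstract's main ingredients (a per-weight dropout rate, a lower-variance gradient estimator, and pruning of weights with very large dropout rates) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code. The constants `K1`, `K2`, `K3` and the pruning threshold `log alpha > 3` are taken from the paper itself, not from this summary. The function names, the array shapes, and the `1e-8` stabilizers are assumptions made for the example.

```python
import numpy as np

# Constants of the KL approximation as given in the paper (not in this summary).
K1, K2, K3 = 0.63576, 1.87320, 1.48695


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def log_alpha(theta, log_sigma2):
    """Per-weight log dropout rate: log(sigma^2 / theta^2)."""
    return log_sigma2 - np.log(theta ** 2 + 1e-8)


def neg_kl_approx(la):
    """Approximate -KL(q || p) per weight as a function of log alpha."""
    return K1 * sigmoid(K2 + K3 * la) - 0.5 * np.log1p(np.exp(-la)) - K1


def forward(x, theta, log_sigma2, rng):
    """Noisy linear layer using the local reparameterization trick.

    Noise is sampled on the pre-activations instead of on each weight.
    sigma^2 is parameterized directly (additive noise), which is the
    variance-reduction idea the abstract refers to.
    """
    mean = x @ theta
    var = (x ** 2) @ np.exp(log_sigma2)
    return mean + np.sqrt(var + 1e-8) * rng.standard_normal(mean.shape)


def sparse_weights(theta, log_sigma2, threshold=3.0):
    """Zero out weights whose log alpha exceeds the threshold."""
    keep = log_alpha(theta, log_sigma2) < threshold
    return theta * keep, keep


# Toy usage: 4 inputs of dimension 3, layer with 2 outputs.
rng = np.random.default_rng(0)
theta = rng.standard_normal((3, 2))
log_sigma2 = np.full((3, 2), -8.0)  # small initial noise variance (assumed)
x = rng.standard_normal((4, 3))

y = forward(x, theta, log_sigma2, rng)
regularizer = -neg_kl_approx(log_alpha(theta, log_sigma2)).sum()
pruned, mask = sparse_weights(theta, log_sigma2)
print(y.shape, float(regularizer), int(mask.sum()))
```

During training, `regularizer` would be added to the data loss and both `theta` and `log_sigma2` optimized with a framework that provides gradients. At test time only the weights selected by `mask` are kept.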

Code Implementations: 15 repos

Data from Papers with Code (CC-BY-SA-4.0)

