LG · CL · ML · Dec 12, 2018

Bayesian Sparsification of Gated Recurrent Neural Networks

arXiv:1812.05692v1 · 5 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the efficiency and interpretability of recurrent neural networks. It is an incremental advance: it extends existing Bayesian sparsification techniques to gated architectures.

The paper sparsifies gated recurrent neural networks, such as LSTM, with Bayesian methods. Beyond individual weights and neurons, it also sparsifies the preactivations of gates, which speeds up the forward pass and improves compression. The resulting gate sparsity patterns are interpretable and vary by task.

Bayesian methods have been successfully applied to sparsify the weights of neural networks and to remove structural units, e.g., neurons, from the networks. We apply and further develop this approach for gated recurrent architectures. Specifically, in addition to sparsifying individual weights and neurons, we propose to sparsify the preactivations of gates and the information flow in LSTM. This makes some gates and information flow components constant, speeds up the forward pass, and improves compression. Moreover, the resulting structure of gate sparsity is interpretable and depends on the task. Code is available on GitHub: https://github.com/tipt0p/SparseBayesianRNN
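The claim that sparsifying gate preactivations makes some gates constant and speeds up the forward pass can be illustrated with a small sketch. It is not taken from the authors' repository. It assumes, for illustration only, that each gate unit's preactivation is scaled by a per-unit multiplier `z` before the bias is added, so that a multiplier of exactly zero leaves only the bias. The names `gate_dense`, `gate_sparse`, `W`, `z`, and `b` are hypothetical, and `x` stands in for the concatenated input and hidden state.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gate_dense(W, x, z, b):
    """Gate values with a per-unit multiplier z on the preactivation (assumed form)."""
    out = []
    for row, z_k, b_k in zip(W, z, b):
        pre = sum(w * x_j for w, x_j in zip(row, x))
        out.append(sigmoid(z_k * pre + b_k))
    return out


def gate_sparse(W, x, z, b):
    """Same gate, but skips the dot product wherever z is exactly zero."""
    out = []
    for row, z_k, b_k in zip(W, z, b):
        if z_k == 0.0:
            # Constant gate: the value depends only on the bias, not on x.
            out.append(sigmoid(b_k))
        else:
            pre = sum(w * x_j for w, x_j in zip(row, x))
            out.append(sigmoid(z_k * pre + b_k))
    return out


random.seed(0)
n_in, n_hidden = 4, 3
W = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b = [random.gauss(0, 1) for _ in range(n_hidden)]
z = [1.0, 0.0, 0.5]  # hypothetical multipliers; unit 1 is treated as pruned
x = [random.gauss(0, 1) for _ in range(n_in)]

dense = gate_dense(W, x, z, b)
sparse = gate_sparse(W, x, z, b)
```

Under this assumption both functions return the same gate values, while `gate_sparse` never touches the weight row of the pruned unit. That row can therefore be dropped from storage (compression) and from the matrix product (a faster forward pass).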

Code Implementations: 1 repo
