LG, CL, ML · Nov 13, 2019

Structured Sparsification of Gated Recurrent Neural Networks

arXiv:1911.05585v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work provides an incremental improvement for researchers and practitioners in natural language processing by adapting existing sparsification techniques to gated recurrent architectures.

The authors tackled the problem of compressing gated recurrent neural networks by sparsifying weights, neurons, and gate preactivations, which simplifies LSTM structures and improves neuron-wise compression in most text classification and language modeling tasks.

Many techniques have recently been developed to sparsify the weights of neural networks and to remove structural units of networks, e.g. neurons. We adapt the existing sparsification approaches to gated recurrent architectures. Specifically, in addition to sparsifying weights and neurons, we propose sparsifying the preactivations of gates. This makes some gates constant and simplifies the LSTM structure. We test our approach on text classification and language modeling tasks. We observe that the resulting structure of gate sparsity depends on the task, and we relate the learned structure to the specifics of each task. Our method also improves neuron-wise compression of the model on most of the tasks.
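The claim that sparsifying a gate's preactivation makes the gate constant can be illustrated with a small NumPy sketch. This is not the authors' implementation: the per-unit multiplier `z`, the function `gate`, and all shapes and values are assumptions made for illustration. The sketch only shows that once the input-dependent part of a preactivation is zeroed, the gate reduces to the sigmoid of its bias and no longer depends on the input or the hidden state.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gate(x, h, W, U, b, z):
    """One LSTM-style gate.

    z is a hypothetical per-unit multiplier on the preactivation
    (an assumption for this sketch): z[i] == 0 removes the
    input-dependent part of unit i, leaving only its bias.
    """
    preact = W @ x + U @ h          # input-dependent preactivation
    return sigmoid(z * preact + b)  # bias is kept outside the multiplier

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W = rng.normal(size=(n_hid, n_in))
U = rng.normal(size=(n_hid, n_hid))
b = np.array([2.0, -1.0, 0.5])
z = np.array([1.0, 0.0, 0.0])       # units 1 and 2 are sparsified

# Two unrelated (input, hidden state) pairs.
x1, h1 = rng.normal(size=n_in), rng.normal(size=n_hid)
x2, h2 = rng.normal(size=n_in), rng.normal(size=n_hid)

g1 = gate(x1, h1, W, U, b, z)
g2 = gate(x2, h2, W, U, b, z)

# Sparsified units give the same value for any input: sigmoid(b).
print(np.allclose(g1[1:], g2[1:]))
print(np.allclose(g1[1:], sigmoid(b[1:])))
```

In this sketch a constant gate needs no matrix-vector products at inference time, which is one way to read the statement that gate sparsification simplifies the LSTM structure.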
