LG CL ML

Nov 13, 2019

Structured Sparsification of Gated Recurrent Neural Networks

Ekaterina Lobacheva, Nadezhda Chirkova, Alexander Markovich, Dmitry Vetrov

arXiv:1911.05585v1

3.43 citations

Originality Synthesis-oriented

AI Analysis

This work provides an incremental improvement for researchers and practitioners in natural language processing by adapting existing sparsification techniques to gated recurrent architectures.

The authors tackled the problem of compressing gated recurrent neural networks by sparsifying weights, neurons, and gate preactivations, which simplifies LSTM structures and improves neuron-wise compression in most text classification and language modeling tasks.

Recently, many techniques have been developed to sparsify the weights of neural networks and to remove structural units of networks, e.g., neurons. We adapt existing sparsification approaches to gated recurrent architectures. Specifically, in addition to sparsifying weights and neurons, we propose sparsifying the preactivations of gates. This makes some gates constant and simplifies the LSTM structure. We test our approach on text classification and language modeling tasks. We observe that the resulting structure of gate sparsity depends on the task, and we connect the learned structure to the specifics of each task. Our method also improves neuron-wise compression of the model on most of the tasks.
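To make the idea of a gate becoming constant concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a hypothetical per-neuron multiplicative mask (`gate_mask`) applied to each gate's preactivation before the bias is added, and it sets the mask by hand instead of learning it. The function name `lstm_step` and the parameter layout are also illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, params, gate_mask):
    """One LSTM step with a multiplicative mask on every preactivation.

    Assumption (not from the source): gate_mask[name] has shape (hidden,).
    A zero entry removes the input-dependent part of that gate for that
    neuron, so the gate reduces to a function of its bias alone.
    """
    pre = {}
    for name in ("i", "f", "o", "g"):  # input, forget, output gates; candidate
        W, U, b = params[name]
        pre[name] = gate_mask[name] * (W @ x + U @ h) + b
    i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
    g = np.tanh(pre["g"])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new, {"i": i, "f": f, "o": o}

# Toy setup with random parameters (illustrative values only).
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
params = {
    name: (rng.normal(size=(n_hid, n_in)),   # W: input-to-hidden
           rng.normal(size=(n_hid, n_hid)),  # U: hidden-to-hidden
           rng.normal(size=n_hid))           # b: bias
    for name in ("i", "f", "o", "g")
}
gate_mask = {name: np.ones(n_hid) for name in ("i", "f", "o", "g")}
gate_mask["f"] = np.zeros(n_hid)  # sparsify the whole forget-gate preactivation

h0 = rng.normal(size=n_hid)
c0 = rng.normal(size=n_hid)
_, _, gates_a = lstm_step(rng.normal(size=n_in), h0, c0, params, gate_mask)
_, _, gates_b = lstm_step(rng.normal(size=n_in), h0, c0, params, gate_mask)

# The masked forget gate no longer depends on the input or hidden state.
forget_is_constant = np.allclose(gates_a["f"], gates_b["f"])
print("forget gate constant across inputs:", forget_is_constant)
```

With the forget-gate mask at zero, the gate equals `sigmoid(b)` for every input, so the corresponding rows of `W` and `U` are no longer needed at inference time. The unmasked gates still vary with the input.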
