LG · ML · May 29, 2019

Rethinking Full Connectivity in Recurrent Neural Networks

arXiv:1905.12340v1 · 6 citations
Originality: Incremental advance
AI Analysis

This work addresses the high computational cost of RNNs on low-power devices and offers a more efficient alternative for real-world sequence modeling tasks. It is incremental in that it builds on existing RNN architectures.

The paper tackles the computational and memory burden of fully connected recurrent neural networks (RNNs) by proposing structurally sparse RNNs, which achieve competitive performance on tasks such as language modeling and speech recognition while reducing the number of recurrent weights and enabling acceleration on parallel hardware.

Recurrent neural networks (RNNs) are omnipresent in sequence modeling tasks. Practical models usually consist of several layers of hundreds or thousands of neurons which are fully connected. This places a heavy computational and memory burden on hardware, restricting adoption in practical low-cost and low-power devices. Compared to fully convolutional models, the costly sequential operation of RNNs severely hinders performance on parallel hardware. This paper challenges the convention of full connectivity in RNNs. We study structurally sparse RNNs, showing that they are well suited for acceleration on parallel hardware, with a greatly reduced cost of the recurrent operations as well as orders of magnitude fewer recurrent weights. Extensive experiments on challenging tasks ranging from language modeling and speech recognition to video action recognition reveal that structurally sparse RNNs achieve competitive performance compared to fully connected networks. This allows for using large sparse RNNs for a wide range of real-world tasks that previously were too costly with fully connected networks.
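
To make the weight-count claim concrete, here is a minimal sketch of one possible form of structural sparsity: a block-diagonal recurrent matrix. The abstract does not state which sparsity pattern the paper uses, so the block-diagonal layout is an assumption chosen for illustration, and the function names `recurrent_weight_count` and `block_diagonal_step` are hypothetical.

```python
import math


def recurrent_weight_count(hidden_size, num_blocks=1):
    """Count hidden-to-hidden weights for a block-diagonal recurrent matrix.

    num_blocks=1 is the fully connected case; num_blocks=hidden_size is a
    purely diagonal recurrence (one weight per neuron).
    Assumption: equally sized blocks, which the abstract does not specify.
    """
    if hidden_size % num_blocks != 0:
        raise ValueError("hidden_size must be divisible by num_blocks")
    block = hidden_size // num_blocks
    return num_blocks * block * block


def block_diagonal_step(h, blocks, x_proj):
    """One update h' = tanh(W_rec h + x_proj) with block-diagonal W_rec.

    h      -- previous hidden state (list of floats)
    blocks -- diagonal blocks of W_rec, each a square list of rows
    x_proj -- input contribution, already projected to hidden size
    """
    new_h = []
    offset = 0
    for blk in blocks:
        size = len(blk)
        # Each block only reads its own slice of the hidden state, so the
        # blocks are independent and could be computed in parallel.
        h_slice = h[offset:offset + size]
        for r, row in enumerate(blk):
            pre = sum(w * v for w, v in zip(row, h_slice)) + x_proj[offset + r]
            new_h.append(math.tanh(pre))
        offset += size
    return new_h


n = 1024
print(recurrent_weight_count(n))      # fully connected: 1048576
print(recurrent_weight_count(n, 16))  # 16 blocks of 64: 65536
print(recurrent_weight_count(n, n))   # diagonal: 1024
```

Under this assumed layout, a 1024-unit layer drops from about one million recurrent weights to 1024 in the diagonal case, a reduction of three orders of magnitude, which is the scale of saving the abstract describes.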

