AI · CV · Aug 22, 2017

Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

arXiv:1708.06834v3 · 234 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency issues in sequence modeling for applications such as NLP and time-series analysis, but it is an incremental improvement over existing RNN methods.

The paper tackles the problem of slow inference and vanishing gradients in RNNs on long sequences by introducing Skip RNN, which learns to skip state updates, reducing the number of required updates while preserving or improving performance compared to baseline RNNs.

Recurrent Neural Networks (RNNs) continue to show outstanding performance in sequence modeling tasks. However, training RNNs on long sequences often faces challenges such as slow inference, vanishing gradients, and difficulty in capturing long-term dependencies. In backpropagation through time settings, these issues are tightly coupled with the large, sequential computational graph resulting from unfolding the RNN in time. We introduce the Skip RNN model, which extends existing RNN models by learning to skip state updates and shortens the effective size of the computational graph. This model can also be encouraged to perform fewer state updates through a budget constraint. We evaluate the proposed model on various tasks and show how it can reduce the number of required RNN updates while preserving, and sometimes even improving, the performance of the baseline RNN models. Source code is publicly available at https://imatge-upc.github.io/skiprnn-2017-telecombcn/ .
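
To make the idea of skipping state updates concrete, here is a minimal forward-pass sketch in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function name `skip_rnn_forward`, the parameter names (`W_x`, `W_h`, `b`, `w_p`, `b_p`, `cost_per_update`), the plain tanh cell, and the exact gating rule (a binarized update probability that accumulates while the state is being copied) are all choices made for this sketch. The budget constraint is modeled here simply as a cost proportional to the number of updates performed.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def skip_rnn_forward(xs, W_x, W_h, b, w_p, b_p, cost_per_update=0.0):
    """Run a tanh RNN that may skip state updates (illustrative sketch).

    xs: array of shape (T, input_dim).
    Returns (final_state, gates, budget_penalty), where gates[t] is 1.0
    if the state was updated at step t and 0.0 if it was copied.
    """
    s = np.zeros(W_h.shape[0])
    u_tilde = 1.0  # update probability; starts at 1 so the first step updates
    gates = []
    for x in xs:
        # Assumption: the binary gate is obtained by rounding the probability.
        u = 1.0 if u_tilde >= 0.5 else 0.0
        if u == 1.0:
            # Regular RNN update; the probability is then reset from the new state.
            s = np.tanh(W_x @ x + W_h @ s + b)
            u_tilde = sigmoid(w_p @ s + b_p)
        else:
            # Skip: the state is copied unchanged and the update probability
            # accumulates, so longer runs of skips make an update more likely.
            delta = sigmoid(w_p @ s + b_p)
            u_tilde = u_tilde + min(delta, 1.0 - u_tilde)
        gates.append(u)
    gates = np.array(gates)
    # Assumed budget term: a cost proportional to the number of updates used.
    return s, gates, cost_per_update * gates.sum()
```

With a strongly negative `b_p` the sketch updates once and then copies its state for many steps, while a strongly positive `b_p` makes it behave like an ordinary RNN. Training such a model would also need a way to pass gradients through the rounding step, which this forward-only sketch does not cover.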

Code Implementations · 3 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.