LG
Mar 11, 2023

Resurrecting Recurrent Neural Networks for Long Sequences

DeepMind
arXiv:2303.06349v1
493 citations
h-index: 75
Originality: Incremental advance
AI Analysis

This work addresses the challenge of optimizing RNNs for long-range reasoning tasks, offering a competitive alternative to state-space models for researchers and practitioners in sequence modeling.

The paper shows that carefully designed recurrent neural networks (RNNs) can match the performance and training speed of deep state-space models on long sequences, with competitive results on the Long Range Arena benchmark.

Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are hard to optimize and slow to train. Deep state-space models (SSMs) have recently been shown to perform remarkably well on long sequence modeling tasks, and have the added benefits of fast parallelizable training and RNN-like fast inference. However, while SSMs are superficially similar to RNNs, there are important differences that make it unclear where their performance boost over RNNs comes from. In this paper, we show that careful design of deep RNNs using standard signal propagation arguments can recover the impressive performance of deep SSMs on long-range reasoning tasks, while also matching their training speed. To achieve this, we analyze and ablate a series of changes to standard RNNs including linearizing and diagonalizing the recurrence, using better parameterizations and initializations, and ensuring proper normalization of the forward pass. Our results provide new insights on the origins of the impressive performance of deep SSMs, while also introducing an RNN block called the Linear Recurrent Unit that matches both their performance on the Long Range Arena benchmark and their computational efficiency.
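The changes listed in the abstract (a linear, diagonal recurrence, a stable parameterization and initialization, and forward-pass normalization) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `init_lru` and `lru_scan` are hypothetical, the `r_min` and `r_max` defaults are illustrative, and the specific choices below (an exponential parameterization of the eigenvalues, initialization on a ring inside the unit disk, and a `sqrt(1 - |lambda|^2)` input scaling) are assumptions about how such a unit might be built; the abstract does not spell them out. Input and output projections are omitted, so each input is assumed to be already projected to the state dimension.

```python
import cmath
import math
import random


def init_lru(state_dim, r_min=0.5, r_max=0.99, seed=0):
    """Sample parameters of a diagonal linear recurrence (hypothetical helper)."""
    rng = random.Random(seed)
    nu_log, theta_log, gamma_log = [], [], []
    for _ in range(state_dim):
        # Assumed initialization: eigenvalue magnitudes drawn so that the
        # eigenvalues are uniform on a ring between r_min and r_max.
        radius = math.sqrt(rng.random() * (r_max**2 - r_min**2) + r_min**2)
        phase = rng.uniform(1e-3, 2 * math.pi)  # lower bound avoids log(0)
        # Assumed parameterization: lambda = exp(-exp(nu_log) + i * exp(theta_log)),
        # which keeps |lambda| < 1 for any real nu_log.
        nu_log.append(math.log(-math.log(radius)))
        theta_log.append(math.log(phase))
        # Assumed normalization: scale inputs by sqrt(1 - |lambda|^2).
        gamma_log.append(0.5 * math.log(1.0 - radius**2))
    return nu_log, theta_log, gamma_log


def lru_scan(inputs, nu_log, theta_log, gamma_log):
    """Run x_k = lambda * x_{k-1} + gamma * u_k elementwise over a sequence."""
    lam = [
        cmath.exp(complex(-math.exp(n), math.exp(t)))
        for n, t in zip(nu_log, theta_log)
    ]
    gamma = [math.exp(g) for g in gamma_log]
    state = [0j] * len(lam)
    states = []
    for u in inputs:  # u holds one real input per state channel
        # Diagonal recurrence: each channel evolves independently.
        state = [l * x + g * ui for l, x, g, ui in zip(lam, state, gamma, u)]
        states.append(state)
    return lam, states


# Usage: feed a unit impulse followed by zeros and inspect the decaying states.
nu, th, ga = init_lru(state_dim=4)
impulse = [[1.0] * 4] + [[0.0] * 4] * 5
eigenvalues, trajectory = lru_scan(impulse, nu, th, ga)
```

The loop here is sequential for readability. Because the recurrence is linear and elementwise, the same computation can in principle be expressed as an associative scan and evaluated in parallel over the sequence, which is the property that makes fast training possible for this family of models.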

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

