LG, DS, ML · Jun 22, 2020

Lipschitz Recurrent Neural Networks

arXiv:2006.12070v3 · 134 citations
Originality: Incremental advance
AI Analysis

This work addresses the stability and robustness of recurrent neural networks (RNNs) on tasks such as computer vision and language modeling. It is an incremental advance built on a novel architectural design.

The authors tackle the problem of designing stable RNNs by proposing a Lipschitz RNN unit that combines a linear component with a Lipschitz nonlinearity. In their experiments the unit can outperform existing recurrent units on benchmark tasks in computer vision, language modeling, and speech prediction, and it is more robust to input and parameter perturbations than other continuous-time RNNs.

Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity. This particular functional form facilitates stability analysis of the long-term behavior of the recurrent unit using tools from nonlinear systems theory. In turn, this enables architectural design decisions before experimentation. Sufficient conditions for global stability of the recurrent unit are obtained, motivating a novel scheme for constructing hidden-to-hidden matrices. Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks, including computer vision, language modeling and speech prediction tasks. Finally, through Hessian-based analysis we demonstrate that our Lipschitz recurrent unit is more robust with respect to input and parameter perturbations as compared to other continuous-time RNNs.
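
The abstract describes the unit only in words, so the following is a minimal sketch of what "a linear component plus a Lipschitz nonlinearity" could look like in practice, not the authors' implementation. It assumes the continuous-time form dh/dt = A h + tanh(W h + U x + b), discretized with a forward-Euler step, and it assumes the hidden-to-hidden matrices are built by blending the symmetric and skew-symmetric parts of a free matrix M and shifting by -gamma * I. The function names, the parameters beta, gamma, and eps, and all numeric values are illustrative assumptions that do not appear in the text above.

```python
import numpy as np


def structured_matrix(M, beta, gamma):
    """Assumed construction: blend the symmetric and skew-symmetric parts
    of M, then subtract gamma * I to push eigenvalues to the left."""
    sym = M + M.T
    skew = M - M.T
    return (1.0 - beta) * sym + beta * skew - gamma * np.eye(M.shape[0])


def lipschitz_rnn_step(h, x, A, W, U, b, eps):
    """One forward-Euler step of dh/dt = A h + tanh(W h + U x + b).
    tanh is 1-Lipschitz, which is what makes the nonlinearity 'Lipschitz'."""
    return h + eps * (A @ h + np.tanh(W @ h + U @ x + b))


def run_sequence(xs, A, W, U, b, eps):
    """Unroll the unit over a sequence xs of shape (T, n_in)."""
    h = np.zeros(A.shape[0])
    for x in xs:
        h = lipschitz_rnn_step(h, x, A, W, U, b, eps)
    return h


# Hypothetical sizes and hyperparameters, chosen only for illustration.
rng = np.random.default_rng(0)
n_hidden, n_in, T = 8, 3, 50
scale = 1.0 / np.sqrt(n_hidden)

A = structured_matrix(rng.normal(scale=scale, size=(n_hidden, n_hidden)),
                      beta=0.75, gamma=0.5)
W = structured_matrix(rng.normal(scale=scale, size=(n_hidden, n_hidden)),
                      beta=0.75, gamma=0.5)
U = rng.normal(scale=scale, size=(n_hidden, n_in))
b = np.zeros(n_hidden)

xs = rng.normal(size=(T, n_in))
h_final = run_sequence(xs, A, W, U, b, eps=0.05)
```

With beta = 1 the constructed matrix is purely skew-symmetric apart from the -gamma * I shift, so its symmetric part is exactly -gamma * I. This is the kind of property a stability analysis can exploit, though the precise sufficient conditions are those derived in the paper, not in this sketch.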

Code Implementations: 1 repo
