LG · ML · Mar 16, 2018

Reviving and Improving Recurrent Back-Propagation

arXiv:1803.06396v4 · 139 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the memory cost of training convergent recurrent neural networks, offering memory-efficient alternatives to BPTT and TBPTT for machine learning researchers and practitioners. The advance is incremental because it builds on the existing RBP algorithm.

The paper tackles the instability of recurrent back-propagation (RBP) by proposing two variants, CG-RBP and Neumann-RBP. Neumann-RBP matches the time complexity of truncated back-propagation through time (TBPTT) but uses constant memory, whereas TBPTT's memory cost grows linearly with the number of truncation steps. Experiments on associative memory, document classification in citation networks, and hyperparameter optimization show that these RBP variants, especially Neumann-RBP, are efficient and effective for optimizing convergent recurrent neural networks.

In this paper, we revisit the recurrent back-propagation (RBP) algorithm and discuss the conditions under which it applies, as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). We further investigate the relationship between Neumann-RBP and back-propagation through time (BPTT) and its truncated version (TBPTT). Our Neumann-RBP has the same time complexity as TBPTT but only requires constant memory, whereas TBPTT's memory cost scales linearly with the number of truncation steps. We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks, and hyperparameter optimization for fully connected networks. All experiments demonstrate that RBPs, especially the Neumann-RBP variant, are efficient and effective for optimizing convergent recurrent neural networks. Code is released at: https://github.com/lrjconan/RBP.
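
To make the constant-memory claim concrete, here is a minimal NumPy sketch of the Neumann-series idea. It is not the authors' released implementation. It assumes a toy convergent network h = tanh(W h + b), with W rescaled so the update is a contraction. The function names (fixed_point, neumann_solve), the toy model, and the random vector g standing in for the loss gradient at the fixed point are all illustrative assumptions. The sketch approximates z = (I - J^T)^{-1} g by the truncated sum g + J^T g + (J^T)^2 g + ..., where J is the Jacobian of the update at the fixed point.

```python
import numpy as np


def fixed_point(W, b, steps=200):
    """Iterate h <- tanh(W h + b) to (approximate) convergence."""
    h = np.zeros_like(b)
    for _ in range(steps):
        h = np.tanh(W @ h + b)
    return h


def neumann_solve(jt_matvec, g, num_terms):
    """Approximate (I - J^T)^{-1} g with a truncated Neumann series.

    Only two vectors (v and z) are kept, so memory does not grow
    with num_terms.
    """
    v = g.copy()  # current term (J^T)^k g
    z = g.copy()  # running sum of terms
    for _ in range(num_terms):
        v = jt_matvec(v)
        z = z + v
    return z


rng = np.random.default_rng(0)
n = 5
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)  # assumption: spectral norm 0.5, so the map contracts
b = rng.standard_normal(n)

h_star = fixed_point(W, b)
# Jacobian of tanh(W h + b) with respect to h, evaluated at the fixed point.
J = (1.0 - h_star**2)[:, None] * W

g = rng.standard_normal(n)  # hypothetical stand-in for dL/dh at the fixed point
z_neumann = neumann_solve(lambda v: J.T @ v, g, num_terms=50)
z_exact = np.linalg.solve(np.eye(n) - J.T, g)  # dense reference solve

# Gradient with respect to b for the loss L = g . h*, via the implicit function theorem.
grad_b = (1.0 - h_star**2) * z_neumann
```

In this sketch the loop holds only the current term and the running sum, whatever the number of terms. Unrolling the same number of TBPTT steps would store one hidden state per step. The dense solve is included only as a reference for checking the approximation on a small problem.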

Code Implementations: 1 repo
