LG, ML · Sep 9, 2019

An Adaptive Stochastic Nesterov Accelerated Quasi Newton Method for Training RNNs

arXiv:1909.03620v1
Originality: Incremental advance
AI Analysis

The paper addresses training challenges for RNNs, but appears incremental because it builds on existing Nesterov acceleration and quasi-Newton techniques.

The paper tackles the vanishing/exploding gradient problem in training Recurrent Neural Networks (RNNs) by proposing an adaptive stochastic Nesterov accelerated quasi-Newton method, which shows improved performance on benchmark sequence modeling tasks while maintaining low per-iteration cost.

A common problem in training neural networks is the vanishing and/or exploding gradient problem, which is most prominent when training Recurrent Neural Networks (RNNs). Several algorithms have therefore been proposed for training RNNs. This paper proposes a novel adaptive stochastic Nesterov accelerated quasi-Newton (aSNAQ) method for training RNNs. aSNAQ is an accelerated method that combines Nesterov's gradient term with second-order curvature information. The method is evaluated in TensorFlow on benchmark sequence modeling problems. The results show improved performance at a low per-iteration cost, suggesting that the method can be used effectively to train RNNs.
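To make the idea of combining Nesterov's gradient term with curvature information concrete, here is a minimal sketch in plain Python. It is not the paper's aSNAQ algorithm: the abstract does not give the update rules, so the sketch assumes a deterministic Nesterov-accelerated BFGS step on a small, well-conditioned quadratic and leaves out the stochastic (mini-batch) and adaptive parts entirely. The names `naq_minimize`, `bfgs_update`, `A`, `B`, `mu`, and `alpha` are all hypothetical.

```python
# Toy objective (an assumption for illustration): f(w) = 0.5 * w^T A w - B^T w,
# with B chosen so that the minimizer is w* = [1, -2].
A = [[1.0, 0.1], [0.1, 0.8]]
B = [0.8, -1.5]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def matvec(M, x):
    return [dot(row, x) for row in M]


def grad(w):
    """Gradient of the toy quadratic: A w - B."""
    Aw = matvec(A, w)
    return [Aw[i] - B[i] for i in range(len(w))]


def bfgs_update(H, s, y):
    """Standard BFGS update of the inverse-Hessian estimate H."""
    sy = dot(s, y)
    if sy <= 1e-12:
        # Skip the update when the curvature s.y is not safely positive.
        return H
    rho = 1.0 / sy
    Hy = matvec(H, y)
    coef = rho * rho * dot(y, Hy) + rho
    n = len(s)
    return [[H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j]) + coef * s[i] * s[j]
             for j in range(n)] for i in range(n)]


def naq_minimize(w, mu=0.8, alpha=1.0, iters=100):
    """Nesterov-style momentum combined with a quasi-Newton direction."""
    n = len(w)
    v = [0.0] * n  # momentum (velocity) term
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iters):
        # Nesterov's gradient term: evaluate the gradient at the look-ahead point.
        look = [w[i] + mu * v[i] for i in range(n)]
        g_look = grad(look)
        # Second-order information: scale the gradient by the curvature estimate.
        d = matvec(H, g_look)
        v = [mu * v[i] - alpha * d[i] for i in range(n)]
        w_new = [w[i] + v[i] for i in range(n)]
        # Curvature pair measured between the look-ahead point and the new iterate.
        s = [w_new[i] - look[i] for i in range(n)]
        g_new = grad(w_new)
        y = [g_new[i] - g_look[i] for i in range(n)]
        H = bfgs_update(H, s, y)
        w = w_new
    return w
```

Calling `naq_minimize([0.0, 0.0])` converges toward the minimizer `[1, -2]` of this toy problem. A method aimed at RNN training would additionally need mini-batch gradients and a limited-memory curvature representation, since a dense `H` is impractical at network scale.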

