NE, Aug 11, 2015

Benchmarking of LSTM Networks

arXiv:1508.02774v1
54 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides incremental insights for researchers using LSTM networks in sequence tasks.

The paper benchmarks LSTM networks on the MNIST and UW3 databases, finding that bidirectional training combined with CTC outperforms other methods and that performance depends smoothly on the learning rate.

LSTM (Long Short-Term Memory) recurrent neural networks have been highly successful in a number of application areas. This technical report describes the use of the MNIST and UW3 databases for benchmarking LSTM networks and explores the effect of different architectural and hyperparameter choices on performance. Significant findings include: (1) LSTM performance depends smoothly on learning rates, (2) batching and momentum have no significant effect on performance, (3) softmax training outperforms least-squares training, (4) peephole units are not useful, (5) the standard non-linearities (tanh and sigmoid) perform best, (6) bidirectional training combined with CTC performs better than other methods.
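To make finding (3) concrete, the two training objectives it compares can be sketched for a single output frame. This is only an illustration, not the report's implementation: the function names are hypothetical, and applying the squared error to sigmoid outputs is an assumption about what "least-squares training" means here.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy_loss(logits, target):
    # "Softmax training": negative log-probability of the correct class.
    return -math.log(softmax(logits)[target])

def least_squares_loss(logits, target):
    # "Least-squares training" (assumed form): squared error between
    # sigmoid outputs and a one-hot target vector.
    outputs = [sigmoid(z) for z in logits]
    return sum(
        (y - (1.0 if k == target else 0.0)) ** 2
        for k, y in enumerate(outputs)
    )

# Hypothetical output activations for a 3-class frame.
logits = [2.0, 0.5, -1.0]
for target in (0, 2):
    print(
        target,
        round(cross_entropy_loss(logits, target), 4),
        round(least_squares_loss(logits, target), 4),
    )
```

Both losses are lower when the target matches the largest activation. They differ in how strongly they penalize confident mistakes: the cross-entropy term is unbounded, while each squared-error term is at most 1.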

Code Implementations: 1 repo