NE · LG · Jul 14, 2017

Simplified Long Short-term Memory Recurrent Neural Networks: part I

arXiv:1707.04619v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses efficiency for embedded platforms, but it is incremental as it modifies existing LSTM architectures without introducing a new paradigm.

The authors tackled the problem of reducing computational complexity in Long Short-term Memory (LSTM) recurrent neural networks by proposing five parameter-reduced variants. On the MNIST dataset the variants achieved accuracy comparable to a standard LSTM while using fewer parameters, and LSTM3, LSTM4, and LSTM5 retained their performance under the ReLU nonlinearity in cases where the standard LSTM's accuracy dropped after a number of epochs.

We present five variants of the standard Long Short-term Memory (LSTM) recurrent neural network, obtained by uniformly reducing blocks of adaptive parameters in the gating mechanisms. For simplicity, we refer to these models as LSTM1, LSTM2, LSTM3, LSTM4, and LSTM5, respectively. Such parameter-reduced variants enable faster training computations and would be more suitable for implementation on constrained embedded platforms. We comparatively evaluate and verify our five variant models on the classical MNIST dataset and demonstrate that they are comparable to a standard implementation of the LSTM model while using fewer parameters. Moreover, we observe that in some cases the standard LSTM's accuracy drops after a number of epochs when using the ReLU nonlinearity; in contrast, LSTM3, LSTM4, and LSTM5 retain their performance.
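To make the idea of "reducing blocks of adaptive parameters in the gating mechanisms" concrete, here is a minimal parameter-counting sketch. The abstract does not say which blocks each of LSTM1 to LSTM5 removes, so the function below does not reproduce any specific variant: the helper `lstm_param_count`, its flags, and the "bias-only gates" configuration are hypothetical illustrations. The input size of 28 (one MNIST image row per time step) and the hidden size of 100 are likewise assumptions. Only the standard LSTM count, 4(mn + n² + n), is the usual textbook figure.

```python
def lstm_param_count(input_dim, hidden_dim,
                     gate_input=True, gate_recurrent="matrix", gate_bias=True):
    """Count trainable parameters in one LSTM layer.

    Hypothetical helper: the same reduction is applied uniformly to the
    three gates (input, forget, output), while the cell-candidate block
    always keeps its input weights, recurrent weights, and bias.
    """
    m, n = input_dim, hidden_dim

    # Cell-candidate block: W (n x m), U (n x n), b (n).
    candidate = m * n + n * n + n

    per_gate = 0
    if gate_input:
        per_gate += m * n          # input weight matrix W
    if gate_recurrent == "matrix":
        per_gate += n * n          # full recurrent matrix U
    elif gate_recurrent == "vector":
        per_gate += n              # pointwise recurrent vector u
    elif gate_recurrent != "none":
        raise ValueError(f"unknown gate_recurrent: {gate_recurrent!r}")
    if gate_bias:
        per_gate += n              # bias vector b

    return candidate + 3 * per_gate


# Assumed sizes: 28 inputs per step (one MNIST row), 100 hidden units.
standard = lstm_param_count(28, 100)
# Illustrative reduction only: each gate keeps nothing but its bias.
bias_only_gates = lstm_param_count(28, 100, gate_input=False,
                                   gate_recurrent="none", gate_bias=True)
print(standard, bias_only_gates)
```

Under these assumed sizes, the three gates account for three quarters of a standard layer's parameters, which is why trimming the gate blocks alone can shrink the model substantially without touching the cell-candidate path.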

