LG, NE
Jul 11, 2016

Recurrent Memory Array Structures

arXiv:1607.03085v3
18 citations
AI Analysis

This work targets character-level text prediction and represents an incremental improvement over existing LSTM architectures.

The authors set out to improve LSTM generalization by augmenting the architecture with multiple memory cells per hidden unit. They report 1.402 BPC on enwik8, improving on the state of the art at the time, and establish neural baselines of 1.12 BPC and 1.19 BPC on enwik9 and enwik10, respectively.
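For readers unfamiliar with the metric, BPC (bits per character) is the average negative base-2 log probability a model assigns to the character that actually comes next; lower is better. The sketch below is illustrative only: the function names are hypothetical and it is not taken from the paper's code.

```python
import math


def bits_per_character(probs_of_true_chars):
    """Mean of -log2(p) over the probabilities the model gave to the true next characters."""
    if not probs_of_true_chars:
        raise ValueError("need at least one prediction")
    total_bits = -sum(math.log2(p) for p in probs_of_true_chars)
    return total_bits / len(probs_of_true_chars)


def nats_to_bpc(mean_cross_entropy_nats):
    """Convert a mean cross-entropy measured in nats (natural log) to bits per character."""
    return mean_cross_entropy_nats / math.log(2)


if __name__ == "__main__":
    # A model that always assigns probability 0.5 to the correct character scores 1 BPC.
    print(bits_per_character([0.5, 0.5, 0.5]))
```

Most training frameworks report cross-entropy in nats, so the second helper is the conversion usually needed before comparing against figures such as 1.402 BPC.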

This report introduces ideas for augmenting the standard Long Short-Term Memory (LSTM) architecture with multiple memory cells per hidden unit in order to improve its generalization capabilities. It considers both deterministic and stochastic variants of the memory operation. The nondeterministic Array-LSTM approach is shown to improve state-of-the-art performance on character-level text prediction, achieving 1.402 BPC on the enwik8 dataset. Furthermore, the report establishes baseline neural-based results of 1.12 BPC and 1.19 BPC for the enwik9 and enwik10 datasets, respectively.
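The abstract gives no equations, so the following is only a rough sketch of what "multiple memory cells per hidden unit" could look like, under stated assumptions: each cell has its own forget, input, and output gates and candidate value, and the hidden output is the sum of the gated cell outputs. That summation is an assumed deterministic variant; the stochastic variants the report mentions are not shown. The function name `array_lstm_unit_step` and its argument layout are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def array_lstm_unit_step(prev_cells, gate_preacts):
    """One time step for a single hidden unit that owns K memory cells.

    prev_cells:   list of K previous cell values c_k.
    gate_preacts: list of K tuples (f, i, o, g) of pre-activations for the
                  forget gate, input gate, output gate, and candidate value.
    Returns (new_cells, h), where h is the unit's hidden output.
    """
    new_cells = []
    h = 0.0
    for c_prev, (f, i, o, g) in zip(prev_cells, gate_preacts):
        # Standard LSTM cell update, applied independently to each cell.
        c = sigmoid(f) * c_prev + sigmoid(i) * math.tanh(g)
        new_cells.append(c)
        # Assumption: the unit's output sums the gated outputs of all its cells.
        h += sigmoid(o) * math.tanh(c)
    return new_cells, h


if __name__ == "__main__":
    cells, h = array_lstm_unit_step([0.0, 0.0], [(0.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, -1.0)])
    print(cells, h)
```

With K = 1 the sketch reduces to an ordinary LSTM unit, which is a useful sanity check on any array-style variant.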

Code Implementations: 2 repos
