CL FL LG | Jun 9, 2019

LSTM Networks Can Perform Dynamic Counting

arXiv:1906.03648v11121 citations
Originality: Incremental advance
AI Analysis

This work examines the computational power of neural networks for formal language recognition and introduces shuffle languages as an analysis tool. It is rated incremental because it builds on existing studies of LSTM capabilities.

The paper assesses whether standard recurrent networks, specifically LSTMs, can perform dynamic counting and encode hierarchical representations. It shows that LSTMs can learn to recognize the well-balanced parenthesis language (Dyck-1) and shuffles of multiple Dyck-1 languages by emulating simple real-time k-counter machines, and that a single-layer LSTM with only one hidden unit is sufficient in practice for Dyck-1. None of the recurrent networks tested performed well on Dyck-2, which requires a stack-like mechanism for recognition.

In this paper, we systematically assess the ability of standard recurrent networks to perform dynamic counting and to encode hierarchical representations. All the neural models in our experiments are designed to be small-sized networks both to prevent them from memorizing the training sets and to visualize and interpret their behaviour at test time. Our results demonstrate that the Long Short-Term Memory (LSTM) networks can learn to recognize the well-balanced parenthesis language (Dyck-$1$) and the shuffles of multiple Dyck-$1$ languages, each defined over different parenthesis-pairs, by emulating simple real-time $k$-counter machines. To the best of our knowledge, this work is the first study to introduce the shuffle languages to analyze the computational power of neural networks. We also show that a single-layer LSTM with only one hidden unit is practically sufficient for recognizing the Dyck-$1$ language. However, none of our recurrent networks was able to yield a good performance on the Dyck-$2$ language learning task, which requires a model to have a stack-like mechanism for recognition.
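The distinction the abstract draws between counting and stack-like behaviour can be illustrated with a minimal sketch. This is not code from the paper: the function names `is_shuffle_of_dyck1` and `is_dyck` and the `pairs` argument are hypothetical, chosen for illustration. The first function recognizes a shuffle of k Dyck-1 languages with k independent counters, in the spirit of a real-time k-counter machine; the second recognizes Dyck-k with a stack, the kind of mechanism Dyck-2 needs.

```python
from typing import Dict, List


def is_shuffle_of_dyck1(s: str, pairs: Dict[str, str]) -> bool:
    """Accept s if, for every parenthesis pair taken on its own, s is balanced.

    One counter per pair is enough, because the pairs may interleave freely.
    `pairs` maps each opening symbol to its closing symbol (illustrative API).
    """
    closer_to_opener = {close: open_ for open_, close in pairs.items()}
    counters = {open_: 0 for open_ in pairs}
    for ch in s:
        if ch in counters:
            counters[ch] += 1              # opening symbol: increment its counter
        elif ch in closer_to_opener:
            opener = closer_to_opener[ch]
            counters[opener] -= 1          # closing symbol: decrement its counter
            if counters[opener] < 0:       # closed more than were opened
                return False
        else:
            return False                   # symbol outside the alphabet
    return all(count == 0 for count in counters.values())


def is_dyck(s: str, pairs: Dict[str, str]) -> bool:
    """Accept s if it is properly nested across all pairs (Dyck-k).

    Counters are not enough here: the order of the open symbols matters,
    so the recognizer remembers them on a stack.
    """
    closer_to_opener = {close: open_ for open_, close in pairs.items()}
    stack: List[str] = []
    for ch in s:
        if ch in pairs:
            stack.append(ch)
        elif ch in closer_to_opener:
            if not stack or stack.pop() != closer_to_opener[ch]:
                return False               # closer does not match the latest opener
        else:
            return False
    return not stack


two_pairs = {"(": ")", "[": "]"}
print(is_shuffle_of_dyck1("([)]", two_pairs))  # interleaving is allowed in a shuffle
print(is_dyck("([)]", two_pairs))              # but it is not properly nested
```

With a single pair, both functions decide the same language (Dyck-1), so one counter suffices. With two pairs, the string `([)]` separates them: it belongs to the shuffle of two Dyck-1 languages but not to Dyck-2.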
