LGCLMLMay 13, 2018

On the Practical Computational Power of Finite Precision RNNs for Language Recognition

arXiv:1805.04908v11201 citations
Originality Incremental advance
AI Analysis

This work addresses the theoretical limitations of RNNs for language recognition, providing insights into model selection for practical applications.

The paper investigates the computational power of finite precision RNNs with linear computation time, showing that LSTM and ReLU-RNNs are strictly stronger than squashing activation RNNs and GRUs due to their ability to implement counting behavior, with empirical evidence that LSTMs effectively learn this mechanism.

While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. We consider the case of RNNs with finite precision whose computation time is linear in the input length. Under these limitations, we show that different RNN variants have different computational power. In particular, we show that the LSTM and the Elman-RNN with ReLU activation are strictly stronger than the RNN with a squashing activation and the GRU. This is achieved because LSTMs and ReLU-RNNs can easily implement counting behavior. We show empirically that the LSTM does indeed learn to effectively use the counting mechanism.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes