CL, Jan 6, 2016

Recurrent Memory Networks for Language Modeling

arXiv:1601.01272v2, 49 citations
AI Analysis

This work addresses the interpretability of RNNs for NLP researchers and practitioners. It offers a novel architecture with strong performance gains, though it is an incremental advance over existing RNN methods.

The paper tackles the challenge of understanding and interpreting RNNs in NLP by proposing the Recurrent Memory Network (RMN), which improves both the performance and the interpretability of RNNs. RMN achieves 69.2% accuracy on the Sentence Completion Challenge and outperforms LSTM on language modeling across three large datasets.

Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning and allows us to discover underlying patterns in data. We demonstrate the power of RMN on language modeling and sentence completion tasks. On language modeling, RMN outperforms the Long Short-Term Memory (LSTM) network on three large German, Italian, and English datasets. Additionally, we perform an in-depth analysis of the various linguistic dimensions that RMN captures. On the Sentence Completion Challenge, for which it is essential to capture sentence coherence, our RMN obtains 69.2% accuracy, surpassing the previous state of the art by a large margin.

Code Implementations: 2 repos
