NELGMar 3, 2015

Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets

arXiv:1503.01007v4425 citations
AI Analysis

This addresses limitations in deep learning for achieving artificial intelligence by enabling models to handle more complex sequence prediction tasks.

The paper tackles the problem of learning algorithmically generated sequences that require counting and memorization, which standard recurrent networks cannot handle, by using a recurrent network with trainable memory to learn basic algorithms from sequential data.

Despite the recent achievements in machine learning, we are still very far from achieving real artificial intelligence. In this paper, we discuss the limitations of standard deep learning approaches and show that some of these limitations can be overcome by learning how to grow the complexity of a model in a structured way. Specifically, we study the simplest sequence prediction problems that are beyond the scope of what is learnable with standard recurrent networks, algorithmically generated sequences which can only be learned by models which have the capacity to count and to memorize sequences. We show that some basic algorithms can be learned from sequential data using a recurrent network associated with a trainable memory.

Code Implementations4 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes