LG | Feb 9, 2016

Learning Efficient Algorithms with Hierarchical Attentive Memory

arXiv:1602.03218v2 | 53 citations
AI Analysis

The paper addresses the cost of memory access, a bottleneck for neural networks that learn algorithms, and offers an architecture relevant to sequence processing more generally.

The paper tackles inefficient memory access in neural networks by proposing Hierarchical Attentive Memory (HAM), which reduces the cost of a single memory access from O(n) to O(log n), where n is the size of the memory. An LSTM augmented with HAM learns algorithms such as sorting in O(n log n) time and generalizes to sequences much longer than those seen during training.

In this paper, we propose and investigate a novel memory architecture for neural networks called Hierarchical Attentive Memory (HAM). It is based on a binary tree with leaves corresponding to memory cells. This allows HAM to perform memory access in O(log n) complexity, which is a significant improvement over the standard attention mechanism that requires O(n) operations, where n is the size of the memory. We show that an LSTM network augmented with HAM can learn algorithms for problems like merging, sorting or binary searching from pure input-output examples. In particular, it learns to sort n numbers in time O(n log n) and generalizes well to input sequences much longer than the ones seen during the training. We also show that HAM can be trained to act like classic data structures: a stack, a FIFO queue and a priority queue.
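The logarithmic access cost described above can be illustrated with a minimal, non-learned sketch. It assumes a complete binary tree stored heap-style in a flat list, with memory cells at the leaves. The names `TreeMemory`, `join`, `go_right` and `tree_sort` are hypothetical and chosen for illustration. In HAM the operations that combine children and choose a branch are learned networks; here they are replaced by hand-written stand-ins (`min` and a simple comparison), so this shows only the access pattern, not the model from the paper.

```python
import math


class TreeMemory:
    """Binary tree over n memory cells, stored heap-style in a flat list.

    Leaves hold the memory cells; each inner node holds join(left, right).
    ASSUMPTION: `join` is a fixed function here, standing in for a learned one.
    """

    def __init__(self, cells, join=min):
        n = len(cells)
        assert n > 0 and n & (n - 1) == 0, "sketch assumes n is a power of two"
        self.n = n
        self.join = join
        self.nodes = [None] * (2 * n)          # index 0 is unused, root is 1
        self.nodes[n:] = list(cells)           # leaves live at n .. 2n-1
        for i in range(n - 1, 0, -1):          # build inner nodes bottom-up
            self.nodes[i] = join(self.nodes[2 * i], self.nodes[2 * i + 1])

    def search(self, go_right):
        """Descend from the root to one leaf, visiting log2(n) inner nodes.

        ASSUMPTION: `go_right(left, right)` is a hard, hand-written decision,
        standing in for a learned branch choice.
        """
        i, visited = 1, 0
        while i < self.n:
            left, right = self.nodes[2 * i], self.nodes[2 * i + 1]
            i = 2 * i + 1 if go_right(left, right) else 2 * i
            visited += 1
        return i - self.n, visited             # (cell index, nodes visited)

    def write(self, index, value):
        """Overwrite one cell and refresh its log2(n) ancestors."""
        i = index + self.n
        self.nodes[i] = value
        i //= 2
        while i >= 1:
            self.nodes[i] = self.join(self.nodes[2 * i], self.nodes[2 * i + 1])
            i //= 2


def tree_sort(values):
    """Sort by repeatedly extracting the minimum: n accesses of O(log n) each."""
    size = 1 << max(0, (len(values) - 1).bit_length())   # pad to a power of two
    mem = TreeMemory(list(values) + [math.inf] * (size - len(values)))
    out = []
    for _ in range(len(values)):
        idx, _ = mem.search(lambda left, right: right < left)
        out.append(mem.nodes[mem.n + idx])
        mem.write(idx, math.inf)               # mark the cell as consumed
    return out


mem = TreeMemory([7, 3, 9, 1, 8, 2, 6, 5])
index, steps = mem.search(lambda left, right: right < left)
print(index, steps)                            # cell holding the minimum, path length
print(tree_sort([7, 3, 9, 1, 8, 2, 6, 5]))
```

With eight cells, each search touches three inner nodes rather than all eight cells, which is the O(log n) versus O(n) contrast drawn in the abstract. Using `min` as the join also makes the tree behave like a priority queue, one of the data structures the abstract says HAM can be trained to imitate.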

