LG

May 19, 2016

One-shot Learning with Memory-Augmented Neural Networks

Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

arXiv:1605.06065v1

35.8

539 citations

Originality: Incremental advance

AI Analysis

This paper addresses inefficient learning and catastrophic interference in neural networks when data is limited. The contribution appears incremental, since it builds on existing memory-augmented architectures such as Neural Turing Machines.

The paper tackles one-shot learning in deep neural networks, which traditionally require extensive data and suffer from catastrophic interference when they encounter new data. It demonstrates that a memory-augmented neural network can rapidly assimilate new data and make accurate predictions after only a few samples.

Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
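The abstract's last sentence refers to accessing external memory by content. As a rough illustration only, the sketch below shows a generic content-based read of the kind used in NTM-style models: a key is compared to every memory row by cosine similarity, the similarities are normalized with a softmax, and the read vector is the weighted sum of the rows. The function names, the toy memory, and the key are all assumptions made for this example; it is not the paper's access module and it omits any write rule.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def content_read(memory, key):
    """Read from memory by content (hypothetical helper for illustration).

    memory: list of rows, each a list of floats of the same width.
    key:    query vector of that width.
    Returns (read_weights, read_vector).
    """
    similarities = [cosine_similarity(key, row) for row in memory]
    weights = softmax(similarities)
    width = len(memory[0])
    # Read vector is the weight-averaged combination of memory rows.
    read_vector = [
        sum(w * row[j] for w, row in zip(weights, memory))
        for j in range(width)
    ]
    return weights, read_vector


# Toy example: three memory slots, a key closest in direction to the first slot.
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
key = [1.0, 0.1]
weights, read_vector = content_read(memory, key)
```

Because the weights depend only on how similar each row is to the key, no slot index or location-based shift is involved. In this toy case the first slot receives the largest weight.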
