Implementing Neural Turing Machines
This work gives researchers working with memory-augmented neural networks a more stable implementation. The contribution is incremental: it focuses on implementation details rather than novel methods.
The paper addresses unstable training in existing Neural Turing Machine implementations by presenting an implementation that learns three sequential learning tasks from the original NTM paper. It finds that networks with memory contents initialized to small constant values converge on average 2 times faster than those using the next best initialization scheme.
Neural Turing Machines (NTMs) are an instance of Memory Augmented Neural Networks, a new class of recurrent neural networks that decouple computation from memory by introducing an external memory unit. NTMs have demonstrated superior performance over Long Short-Term Memory cells in several sequence learning tasks. A number of open-source implementations of NTMs exist, but they are unstable during training and/or fail to replicate the reported performance of NTMs. This paper presents the details of our successful implementation of an NTM. Our implementation learns to solve three sequential learning tasks from the original NTM paper. We find that the choice of memory contents initialization scheme is crucial to successfully implementing an NTM. Networks with memory contents initialized to small constant values converge on average 2 times faster than networks using the next best memory contents initialization scheme.
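As a minimal sketch of what constant memory initialization might look like, the NumPy snippet below fills an external memory matrix with a single small value. The function names, the memory shape (128 slots of width 20), and the constant 1e-6 are illustrative assumptions, not values taken from the abstract. The random scheme is included only as a contrast case and is likewise hypothetical.

```python
import numpy as np


def init_memory_constant(num_slots, slot_width, value=1e-6):
    """Return a memory matrix with every cell set to the same small constant.

    The default value of 1e-6 is an assumption made for illustration.
    """
    return np.full((num_slots, slot_width), value, dtype=np.float32)


def init_memory_random(num_slots, slot_width, scale=0.5, seed=0):
    """Contrast case (hypothetical): memory filled with scaled normal noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((num_slots, slot_width))
    return (scale * noise).astype(np.float32)


# Assumed sizes: 128 memory slots, each a vector of width 20.
memory = init_memory_constant(num_slots=128, slot_width=20)
random_memory = init_memory_random(num_slots=128, slot_width=20)
```

With constant initialization every memory slot starts out identical, so any early differences between slots come from the controller's writes rather than from initial noise. Whether that is the reason for the faster convergence reported above is not stated in the abstract.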