AI · LG · ML · Sep 21, 2017

Neural Optimizer Search with Reinforcement Learning

arXiv:1709.07417v2 · 407 citations
AI Analysis

This work addresses the difficulty of designing optimizers by hand, offering machine learning practitioners an automated approach that improves performance across a range of tasks.

The paper automates the discovery of optimization methods for deep learning: an RNN controller trained with reinforcement learning generates update equations. The search yields new optimizers, PowerSign and AddSign, that outperform common methods such as Adam and SGD on CIFAR-10 and transfer well to ImageNet classification and neural machine translation.

We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient, running average of the gradient, etc. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google's neural machine translation system.
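To make the two named update rules concrete, here is a minimal NumPy sketch of a single PowerSign step and a single AddSign step. It assumes the commonly quoted forms of the rules, in which the gradient g is scaled by e^(sign(g)·sign(m)) for PowerSign and by (1 + sign(g)·sign(m)) for AddSign, with m an exponential moving average of g. The function names, the default hyperparameters, and the omission of any internal decay schedule are illustrative assumptions, not taken from the abstract.

```python
import numpy as np


def powersign_step(w, grad, m, lr=0.1, beta=0.9, base=np.e):
    """One PowerSign-style update (sketch; no decay schedule applied).

    w, grad, m are arrays of the same shape; m is the running average
    of past gradients. Returns the updated (w, m).
    """
    # Exponential moving average of the gradient.
    m = beta * m + (1.0 - beta) * grad
    # +1 where gradient and average agree in sign, -1 where they disagree.
    agreement = np.sign(grad) * np.sign(m)
    # Scale the step up by `base` on agreement, down by 1/base otherwise.
    scale = np.power(base, agreement)
    return w - lr * scale * grad, m


def addsign_step(w, grad, m, lr=0.1, beta=0.9, alpha=1.0):
    """One AddSign-style update (sketch; no decay schedule applied)."""
    m = beta * m + (1.0 - beta) * grad
    agreement = np.sign(grad) * np.sign(m)
    # With alpha = 1 the step doubles on agreement and vanishes otherwise.
    scale = alpha + agreement
    return w - lr * scale * grad, m


def minimize_quadratic(step_fn, w0, num_steps=50):
    """Hypothetical helper: run step_fn on f(w) = 0.5 * ||w||^2."""
    w = np.asarray(w0, dtype=float)
    m = np.zeros_like(w)
    for _ in range(num_steps):
        grad = w  # gradient of 0.5 * ||w||^2
        w, m = step_fn(w, grad, m)
    return w
```

Calling `minimize_quadratic(powersign_step, [1.0, -2.0])` or the same with `addsign_step` drives the weights toward zero on this toy problem. Both rules take larger steps along coordinates where the current gradient agrees in sign with its running average and smaller steps where it disagrees.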

Code Implementations: 2 repos
