ML LG Nov 11, 2016

Learning to Learn without Gradient Descent by Gradient Descent

Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matt Botvinick, Nando de Freitas

arXiv:1611.03824v634.6121 citations

Originality Incremental advance

AI Analysis

This work addresses efficient optimization across a broad range of applications, including hyper-parameter tuning. It is rated incremental because it builds on existing gradient descent and neural network methods.

The paper tackles derivative-free optimization of black-box functions by training recurrent neural network optimizers on synthetic functions. On hyper-parameter tuning tasks, the learned optimizers compare favourably with heavily engineered Bayesian optimization packages.

We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade-off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
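
The query loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the plain tanh recurrent cell, the weight shapes, the names `make_params`, `rnn_step` and `rollout`, and the `quadratic` test function are all assumptions made for illustration, and the weights are random rather than trained. The sketch only shows how a recurrent optimizer proposes a query, observes the function value without any gradient, and feeds both back into its hidden state. The summed value it returns is one plausible training loss; training by gradient descent would differentiate that sum with respect to the weights on differentiable synthetic functions, which is omitted here.

```python
import math
import random


def make_params(hidden=8, dim=1, seed=0):
    """Random (untrained) weights for a plain tanh RNN. Hypothetical shapes."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.3) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": mat(hidden, dim + 1),  # reads the previous query and its observed value
        "W_h": mat(hidden, hidden),    # recurrent weights
        "W_out": mat(dim, hidden),     # maps the hidden state to the next query
    }


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_step(params, h, x_prev, y_prev):
    """One optimizer step: update the hidden state, then propose the next query."""
    inp = matvec(params["W_in"], x_prev + [y_prev])
    rec = matvec(params["W_h"], h)
    h_new = [math.tanh(a + b) for a, b in zip(inp, rec)]
    x_new = matvec(params["W_out"], h_new)
    return h_new, x_new


def rollout(params, f, horizon=10, dim=1):
    """Unroll the optimizer on a black-box f; only function values are observed."""
    hidden = len(params["W_h"])
    h, x, y = [0.0] * hidden, [0.0] * dim, 0.0
    queries, values = [], []
    for _ in range(horizon):
        h, x = rnn_step(params, h, x, y)
        y = f(x)  # black-box evaluation, no gradient of f is used
        queries.append(x)
        values.append(y)
    # Summed observed values: an assumed training objective for illustration.
    return queries, values, sum(values)


def quadratic(x):
    """Hypothetical synthetic objective with its minimum at 0.5."""
    return sum((xi - 0.5) ** 2 for xi in x)


params = make_params()
queries, values, loss = rollout(params, quadratic, horizon=10)
```

Because this cell has no bias terms and starts from a zero state, its first query is always the origin; a trained optimizer would instead spend its early queries exploring and its later ones exploiting, up to the horizon it was trained on.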
