ML, Jan 27, 2018

Gradient descent revisited via an adaptive online learning rate

arXiv:1801.09136v2, 2 citations
AI Analysis

This paper addresses the tedious and often suboptimal tuning of learning rates in deep models, though it appears incremental because it builds on existing gradient descent frameworks.

The paper tackles the problem of manually tuning learning rates in gradient descent by proposing an adaptive method that learns the learning rate itself, either via first-order or second-order optimization, enabling optimization of gradient descent for any machine learning algorithm.

Any gradient descent optimization requires choosing a learning rate. With deeper and deeper models, tuning that learning rate can easily become tedious and does not necessarily lead to ideal convergence. We propose a variation of the gradient descent algorithm in which the learning rate is not fixed. Instead, we learn the learning rate itself, either by another gradient descent (first-order method) or by Newton's method (second-order method). This way, gradient descent for any machine learning algorithm can be optimized.
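The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it is one plausible reading under stated assumptions. It assumes a toy quadratic objective (the `CURVATURE` values are hypothetical), and it treats the loss after one step, h(lr) = loss(w - lr * grad(w)), as a function of the learning rate, which is then updated either by a gradient step (first-order) or by a Newton step (second-order). All function and parameter names, including `adaptive_gd` and `meta_lr`, are illustrative. The second derivative is computed in closed form here, which only works because the objective is quadratic.

```python
# Toy objective (an assumption for illustration):
# loss(w) = 0.5 * sum(a_i * w_i^2), with its minimum at w = 0.
CURVATURE = [1.0, 10.0]


def loss(w):
    return 0.5 * sum(a * x * x for a, x in zip(CURVATURE, w))


def grad(w):
    return [a * x for a, x in zip(CURVATURE, w)]


def step(w, g, lr):
    """Return a new point w - lr * g without mutating w."""
    return [x - lr * gx for x, gx in zip(w, g)]


def dloss_dlr(w, g, lr):
    """First derivative of h(lr) = loss(w - lr * g) with respect to lr."""
    g_next = grad(step(w, g, lr))
    return -sum(a * b for a, b in zip(g_next, g))


def d2loss_dlr2(g):
    """Second derivative of h(lr); equals g^T H g, exact for this quadratic."""
    return sum(a * gx * gx for a, gx in zip(CURVATURE, g))


def adaptive_gd(w, lr=0.01, meta_lr=None, steps=50):
    """Gradient descent that also updates its own learning rate.

    meta_lr=None  -> Newton update on lr (second-order variant)
    meta_lr=float -> gradient update on lr (first-order variant)
    """
    for _ in range(steps):
        g = grad(w)
        curvature = d2loss_dlr2(g)
        if curvature == 0.0:  # zero gradient: already at the minimum
            break
        slope = dloss_dlr(w, g, lr)
        if meta_lr is None:
            lr = lr - slope / curvature  # Newton step on the learning rate
        else:
            lr = lr - meta_lr * slope    # gradient step on the learning rate
        w = step(w, g, lr)               # ordinary descent step with the new lr
    return w, lr


if __name__ == "__main__":
    start = [1.0, 1.0]
    print("second-order:", adaptive_gd(start))
    print("first-order: ", adaptive_gd(start, meta_lr=1e-4))
```

On this quadratic the Newton update recovers the exact line-search step at every iteration, because h(lr) is itself quadratic in lr. For a general model the second derivative would have to be approximated, and the first-order variant needs its own step size (`meta_lr` here), so the tuning burden is reduced rather than removed.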

Code Implementations: 1 repo
