LG ML Oct 2, 2020

A straightforward line search approach on the expected empirical loss for stochastic deep learning problems

arXiv:2010.00921v1 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of step size selection in deep learning optimization, offering a practical solution for researchers and practitioners, though it is incremental as it builds on traditional line search techniques.

The paper tackles the problem of unknown optimal step sizes in stochastic gradient descent for deep learning. It proposes cheaply approximating the expected empirical loss by fitting a one-dimensional function to noisy loss measurements, which yields a robust optimization method that performs well across datasets and architectures without hyperparameter tuning.

A fundamental challenge in deep learning is that the optimal step sizes for update steps of stochastic gradient descent are unknown. In traditional optimization, line searches are used to determine good step sizes; in deep learning, however, it is too costly to search for good step sizes on the expected empirical loss due to noisy losses. This empirical work shows that it is possible to approximate the expected empirical loss on vertical cross sections for common deep learning tasks quite cheaply. This is achieved by applying traditional one-dimensional function fitting to measured noisy losses of such cross sections. The step to a minimum of the resulting approximation is then used as the step size for the optimization. This approach leads to a robust and straightforward optimization method which performs well across datasets and architectures without the need for hyperparameter tuning.
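The core idea in the abstract (fit a one-dimensional function to noisy losses measured along a line, then step to the minimum of the fit) can be illustrated with a small sketch. This is not the paper's implementation: the polynomial degree, the sampling range, the toy quadratic loss, and the helper name `fit_step_size` are all assumptions chosen for illustration.

```python
import numpy as np

def fit_step_size(steps, noisy_losses, degree=2):
    """Fit a polynomial to noisy losses measured along a line and return
    the step, within the sampled range, that minimizes the fitted curve.
    (Hypothetical helper; the degree-2 default is an assumption.)"""
    poly = np.poly1d(np.polyfit(steps, noisy_losses, degree))
    lo, hi = float(np.min(steps)), float(np.max(steps))
    # Candidate minimizers: interval endpoints plus real critical points inside it.
    crit = poly.deriv().roots
    crit = crit[np.isreal(crit)].real
    crit = crit[(crit >= lo) & (crit <= hi)]
    candidates = np.concatenate(([lo, hi], crit))
    return float(candidates[np.argmin(poly(candidates))])

# Toy stand-in for a cross section of the loss: a quadratic with its
# minimum at step 0.3, observed only through noisy measurements.
rng = np.random.default_rng(0)
true_min = 0.3
steps = rng.uniform(0.0, 1.0, size=200)
noisy_losses = (steps - true_min) ** 2 + rng.normal(0.0, 0.05, size=200)

step = fit_step_size(steps, noisy_losses)
print(f"estimated step size: {step:.3f}")
```

In a training loop, each noisy loss would presumably come from evaluating a different mini-batch at a point along the update direction; here the noise is simulated with Gaussian samples to keep the sketch self-contained.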

Code Implementations: 1 repo
