Probabilistic Line Searches for Stochastic Optimization
This work addresses a practical bottleneck in stochastic optimization for machine learning: choosing a learning rate. It provides a method that automates that choice, though the contribution is incremental in that it builds on existing deterministic line-search and Bayesian optimization techniques.
The paper tackles the problem of adapting line searches to stochastic optimization, where only uncertain gradients are available. It constructs a probabilistic line search that combines the structure of deterministic methods with ideas from Bayesian optimization. The resulting algorithm has low computational cost and no user-controlled parameters, and in the reported experiments it effectively removes the need to define a learning rate for stochastic gradient descent.
In deterministic optimization, line searches are a standard tool ensuring stability and efficiency. Where only stochastic gradients are available, no direct equivalent has so far been formulated, because uncertain gradients do not allow for a strict sequence of decisions collapsing the search space. We construct a probabilistic line search by combining the structure of existing deterministic methods with notions from Bayesian optimization. Our method retains a Gaussian process surrogate of the univariate optimization objective, and uses a probabilistic belief over the Wolfe conditions to monitor the descent. The algorithm has very low computational cost, and no user-controlled parameters. Experiments show that it effectively removes the need to define a learning rate for stochastic gradient descent.
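To make the idea of a "probabilistic belief over the Wolfe conditions" concrete, the following is a minimal sketch, not the paper's algorithm. It checks the weak Wolfe conditions for a step of size t along a search direction, then estimates by Monte Carlo how likely they are to hold when function values and directional derivatives are observed with noise. Everything here is an assumption for illustration: the names `wolfe_holds` and `wolfe_probability`, the independent Gaussian noise model with standard deviations `sigma_f` and `sigma_df`, and the constants `c1 = 1e-4` and `c2 = 0.9` (common textbook defaults). The abstract describes a Gaussian process surrogate rather than sampling; the sampling here is a simple stand-in.

```python
import numpy as np


def wolfe_holds(f0, df0, ft, dft, t, c1=1e-4, c2=0.9):
    """Weak Wolfe conditions for a step of size t along a descent direction.

    f0, df0: objective value and directional derivative at step size 0.
    ft, dft: objective value and directional derivative at step size t.
    Works on scalars or on NumPy arrays of samples.
    """
    armijo = ft <= f0 + c1 * t * df0   # sufficient decrease
    curvature = dft >= c2 * df0        # slope has flattened enough
    return armijo & curvature


def wolfe_probability(f0, df0, ft, dft, t, sigma_f, sigma_df,
                      n_samples=10_000, seed=0):
    """Monte Carlo estimate of P(Wolfe conditions hold) under noise.

    Assumption: each observed quantity equals the true value plus
    independent zero-mean Gaussian noise, so the true values are
    sampled around the observations.
    """
    rng = np.random.default_rng(seed)
    f0_s = f0 + sigma_f * rng.standard_normal(n_samples)
    ft_s = ft + sigma_f * rng.standard_normal(n_samples)
    df0_s = df0 + sigma_df * rng.standard_normal(n_samples)
    dft_s = dft + sigma_df * rng.standard_normal(n_samples)
    return float(np.mean(wolfe_holds(f0_s, df0_s, ft_s, dft_s, t)))


if __name__ == "__main__":
    # Toy 1-D objective f(t) = (t - 1)^2 along the search direction.
    f = lambda t: (t - 1.0) ** 2
    df = lambda t: 2.0 * (t - 1.0)
    for t in (0.01, 1.0, 2.0):
        p = wolfe_probability(f(0.0), df(0.0), f(t), df(t), t,
                              sigma_f=0.01, sigma_df=0.01)
        print(f"t = {t}: estimated Wolfe probability = {p:.3f}")
```

A line search built on such a quantity could accept a step once the estimated probability exceeds a threshold, instead of requiring the conditions to hold exactly, which is what noisy gradients rule out. How the paper sets that threshold and computes the belief is not stated in the abstract.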