Stochastic Learning Rate Optimization in the Stochastic Approximation and Online Learning Settings
This work targets optimization efficiency, a practical concern for machine learning practitioners, but its contribution is incremental because it builds on existing stochastic gradient methods.
The paper studies learning rates in the stochastic and online settings: it introduces multiplicative stochasticity into the learning rate and reports noticeable performance gains over the deterministic-learning-rate versions of the same algorithms.
In this work, multiplicative stochasticity is applied to the learning rate of stochastic optimization algorithms, giving rise to stochastic learning-rate schemes. In-expectation convergence results are provided for Stochastic Gradient Descent equipped with this novel stochastic learning-rate scheme in the stochastic setting, along with convergence results in the online optimization setting. Empirical results consider the case of adaptively uniformly distributed multiplicative stochasticity and cover not only Stochastic Gradient Descent but also other popular algorithms equipped with a stochastic learning rate. They demonstrate noticeable optimization performance gains relative to the deterministic-learning-rate versions of these algorithms.
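To make the idea of a multiplicatively stochastic learning rate concrete, the following is a minimal sketch and not the paper's exact scheme. It assumes the random factor is drawn independently at each step from a fixed uniform interval [1 - c, 1 + c]; the adaptive choice of that interval mentioned in the abstract is not reproduced here. The names `sgd_stochastic_lr`, `noisy_quadratic_grad`, `base_lr`, and `c` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sgd_stochastic_lr(grad_fn, x0, base_lr, n_steps, c=0.5, seed=0):
    """SGD whose learning rate is multiplied by a uniform random factor.

    Assumption (not from the paper): the factor u_t is drawn i.i.d. from
    Uniform[1 - c, 1 + c], so its mean is 1 and c = 0 recovers plain SGD.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        u = rng.uniform(1.0 - c, 1.0 + c)      # multiplicative stochasticity
        x = x - base_lr * u * grad_fn(x, rng)  # step with learning rate base_lr * u
    return x


def noisy_quadratic_grad(x, rng, noise_std=0.1):
    """Stochastic gradient of f(x) = 0.5 * ||x||^2 with additive Gaussian noise."""
    return x + noise_std * rng.standard_normal(x.shape)


# Toy usage: minimize a noisy quadratic starting from (5, -3).
x_final = sgd_stochastic_lr(
    noisy_quadratic_grad, x0=[5.0, -3.0], base_lr=0.1, n_steps=500
)
print(np.linalg.norm(x_final))
```

Setting `c=0` makes the factor identically 1, which gives a deterministic-learning-rate baseline to compare against on the same toy problem. This toy comparison says nothing about the gains reported in the paper.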