AutoGD: Automatic Learning Rate Selection for Gradient Descent
AutoGD addresses the learning-rate tuning burden in gradient-based optimization, particularly when the optimizer runs as an inner loop of another algorithm. The contribution is incremental in that it builds on existing GD methods.
The paper tackles the problem of manually tuning learning rates in gradient descent by introducing AutoGD, which decides at each iteration whether to increase or decrease the learning rate. The authors establish convergence and show that AutoGD recovers the optimal GD rate (up to a constant) for a broad class of functions without knowledge of smoothness constants. Experiments on traditional optimization problems and variational inference tasks demonstrate strong performance.
The performance of gradient-based optimization methods, such as standard gradient descent (GD), greatly depends on the choice of learning rate. However, it can require a non-trivial amount of user tuning effort to select an appropriate learning rate schedule. When such methods appear as inner loops of other algorithms, expecting the user to tune the learning rates may be impractical. To address this, we introduce AutoGD: a gradient descent method that automatically determines whether to increase or decrease the learning rate at a given iteration. We establish the convergence of AutoGD, and show that we can recover the optimal rate of GD (up to a constant) for a broad class of functions without knowledge of smoothness constants. Experiments on a variety of traditional problems and variational inference optimization tasks demonstrate strong performance of the method, along with its extensions to AutoBFGS and AutoLBFGS.
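To make the idea of raising or lowering the learning rate at each iteration concrete, here is a minimal Python sketch. It is not the paper's AutoGD algorithm: the three-candidate rule (shrink, keep, grow), the acceptance test, and the names `auto_lr_gd`, `lr0`, and `factor` are all assumptions chosen for illustration. The sketch evaluates the objective at three candidate step sizes, takes the best one if it improves the objective, and otherwise stays put and shrinks the learning rate.

```python
from typing import Callable, List, Sequence, Tuple


def auto_lr_gd(
    f: Callable[[List[float]], float],
    grad: Callable[[List[float]], List[float]],
    x0: Sequence[float],
    lr0: float = 1.0,
    factor: float = 2.0,
    iters: int = 100,
) -> Tuple[List[float], float]:
    """Illustrative GD with a self-adjusting learning rate (hypothetical rule)."""
    x = [float(v) for v in x0]
    lr = float(lr0)
    fx = f(x)
    for _ in range(iters):
        g = grad(x)
        # Candidate learning rates: smaller, unchanged, larger.
        candidates = [lr / factor, lr, lr * factor]
        trials = [[xi - c * gi for xi, gi in zip(x, g)] for c in candidates]
        values = [f(t) for t in trials]
        best = min(range(len(candidates)), key=values.__getitem__)
        if values[best] < fx:
            # Accept the best candidate and keep its learning rate.
            x, fx, lr = trials[best], values[best], candidates[best]
        else:
            # No candidate improves the objective: do not move, shrink the rate.
            lr /= factor
    return x, lr


# Usage on an ill-conditioned quadratic f(x) = 0.5 * (x1^2 + 10 * x2^2).
scales = [1.0, 10.0]


def quad(x: List[float]) -> float:
    return 0.5 * sum(a * xi * xi for a, xi in zip(scales, x))


def quad_grad(x: List[float]) -> List[float]:
    return [a * xi for a, xi in zip(scales, x)]


x_min, final_lr = auto_lr_gd(quad, quad_grad, x0=[3.0, -2.0], lr0=1.0, iters=300)
print(quad(x_min), final_lr)
```

The initial learning rate of 1.0 is deliberately too large for this problem. The sketch shrinks it on its own during the first few iterations, with no smoothness constant supplied by the user. Each iteration costs three extra objective evaluations, which is the price of this simple rule.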