Linearly convergent stochastic heavy ball method for minimizing generalization error
This work addresses the convergence analysis of stochastic optimization methods for machine learning practitioners, but it is incremental as it builds on existing momentum techniques with a specific focus.
The paper tackled the problem of establishing linear convergence for the stochastic heavy ball method, achieving the first such result for minimizing expected loss with quadratic loss functions, though the overall objective is not necessarily strongly convex.
In this work we establish the first linear convergence result for the stochastic heavy ball method. The method performs SGD steps with a fixed stepsize, amended by a heavy ball momentum term. In the analysis, we focus on minimizing the expected loss and not on finite-sum minimization, which is typically a much harder problem. While in the analysis we constrain ourselves to quadratic loss, the overall objective is not necessarily strongly convex.