Estimate Sequences for Variance-Reduced Stochastic Composite Optimization
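This work provides a unifying theoretical framework for analyzing and designing stochastic optimization algorithms used in machine learning. The contribution is incremental in that it builds on existing methods, namely Nesterov's estimate sequences, SGD, SAGA, and SVRG.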
The authors tackle stochastic convex composite optimization by extending Nesterov's estimate sequence concept to unify gradient-based algorithms such as SGD and variants of SAGA and SVRG. This yields a generic convergence proof, a result showing that the SVRG variant is adaptive to strong convexity, new algorithms with the same guarantees, generic strategies for robustness to stochastic noise, and new accelerated algorithms in the sense of Nesterov.
In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. This point of view covers the stochastic gradient descent method and variants of SAGA and SVRG, and it has several advantages: (i) we provide a generic proof of convergence for the aforementioned methods; (ii) we show that the SVRG variant is adaptive to strong convexity; (iii) we naturally obtain new algorithms with the same guarantees; (iv) we derive generic strategies to make these algorithms robust to stochastic noise, which is useful when data is corrupted by small random perturbations. Finally, we show that this viewpoint is useful for obtaining new accelerated algorithms in the sense of Nesterov.
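To make the setting concrete, the following is a minimal sketch of a textbook proximal SVRG loop on a composite objective. It is not the specific SVRG variant analyzed in the paper, and it does not implement the estimate sequence construction. The least-squares loss, the l1 penalty, the choice of the last iterate as anchor point, and all function names (`soft_threshold`, `objective`, `prox_svrg`) are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (assumed penalty for this sketch)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def objective(A, b, x, lam):
    """Composite objective: mean squared loss (smooth part) plus l1 penalty."""
    residual = A @ x - b
    return 0.5 * np.mean(residual ** 2) + lam * np.sum(np.abs(x))


def prox_svrg(A, b, lam, step, n_epochs=30, seed=0):
    """Textbook proximal SVRG; illustrative only, not the paper's variant."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        # Anchor point and full gradient of the smooth part at the anchor.
        anchor = x.copy()
        full_grad = A.T @ (A @ anchor - b) / n
        for _ in range(n):
            i = rng.integers(n)
            # Variance-reduced gradient estimate: unbiased for the full
            # gradient at x, with variance shrinking as x nears the anchor.
            grad_x = A[i] * (A[i] @ x - b[i])
            grad_anchor = A[i] * (A[i] @ anchor - b[i])
            g = grad_x - grad_anchor + full_grad
            # Proximal step handles the nonsmooth l1 term.
            x = soft_threshold(x - step * g, step * lam)
    return x
```

A conservative step size for this sketch is a small fraction of `1 / L`, where `L` is the largest squared row norm of `A`; this choice is a common heuristic rather than a setting taken from the paper.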