Stochastic Conjugate Gradient Algorithm with Variance Reduction
This work targets the efficiency of optimization algorithms for machine learning, reporting improved computational efficiency on large-scale data sets.
The authors propose a stochastic variant of the conjugate gradient method with variance reduction. They prove linear convergence with the Fletcher and Reeves method for strongly convex and smooth functions, and they report faster convergence than competing methods in experiments on four learning models.
Conjugate gradient (CG) methods are an important class of methods for solving linear equations and nonlinear optimization problems. In this paper, we propose a new stochastic CG algorithm with variance reduction and prove its linear convergence with the Fletcher and Reeves method for strongly convex and smooth functions. We experimentally demonstrate that the CG algorithm with variance reduction converges faster than its counterparts for four learning models, which may be convex, nonconvex or nonsmooth. In addition, its area under the curve (AUC) performance on six large-scale data sets is comparable to that of the LIBLINEAR solver for the L2-regularized L2-loss, but with a significant improvement in computational efficiency.
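To make the idea concrete, the following is a minimal sketch of how a variance-reduced stochastic gradient can be combined with a Fletcher and Reeves direction update. It is not the algorithm as specified in the paper: the toy ridge-regression objective, the function names (ridge_loss, ridge_grad, svrg_fr_cg), the SVRG-style gradient estimator, the default parameters, and the step size rule are all assumptions chosen for illustration. In particular, the step along each direction is computed in closed form on the full quadratic objective, a simplification that a genuinely stochastic implementation would replace with a cheaper line search.

```python
import numpy as np


def ridge_loss(A, b, x, lam):
    """Toy objective: 0.5 * mean squared residual + 0.5 * lam * ||x||^2."""
    r = A @ x - b
    return 0.5 * np.mean(r ** 2) + 0.5 * lam * (x @ x)


def ridge_grad(A, b, x, lam):
    """Gradient of ridge_loss over the rows of A that are passed in."""
    return A.T @ (A @ x - b) / len(b) + lam * x


def svrg_fr_cg(A, b, lam=0.01, epochs=5, inner_steps=20, batch_size=10, seed=0):
    """Hypothetical sketch: SVRG-style gradients with Fletcher-Reeves directions."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    history = [ridge_loss(A, b, x, lam)]
    for _ in range(epochs):
        w = x.copy()                      # snapshot point for this epoch
        mu = ridge_grad(A, b, w, lam)     # full gradient at the snapshot
        g = mu.copy()
        p = -g                            # restart with steepest descent
        for _ in range(inner_steps):
            # Assumption: exact minimizing step along p on the full quadratic.
            Hp = A.T @ (A @ p) / n + lam * p
            curvature = p @ Hp
            if curvature < 1e-30:         # direction has vanished: stop early
                break
            alpha = -(ridge_grad(A, b, x, lam) @ p) / curvature
            x = x + alpha * p
            # Variance-reduced mini-batch gradient at the new iterate.
            idx = rng.choice(n, size=batch_size, replace=False)
            Ab, bb = A[idx], b[idx]
            g_new = ridge_grad(Ab, bb, x, lam) - ridge_grad(Ab, bb, w, lam) + mu
            # Fletcher-Reeves coefficient: ||g_new||^2 / ||g_old||^2.
            denom = g @ g
            beta = (g_new @ g_new) / denom if denom > 0 else 0.0
            p = -g_new + beta * p
            g = g_new
        history.append(ridge_loss(A, b, x, lam))
    return x, history
```

Calling svrg_fr_cg(A, b) on a data matrix A of shape (n, d) with n at least batch_size returns the final iterate and the objective value recorded after each epoch. Because each step minimizes the full objective along the chosen direction, the recorded values cannot increase beyond rounding error, which makes the sketch easy to check but says nothing about the convergence rate proved in the paper.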