Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs
This work addresses convergence analysis for biased SGD, which is incremental as it extends existing SGD theory to account for gradient errors, relevant for optimization in machine learning.
The paper tackled the problem of analyzing convergence rates for biased stochastic gradient descent (SGD) with corrupted gradient updates, developing a method using stochastic quadratic constraints and linear matrix inequalities to derive theoretical bounds and formulas that quantify convergence properties under different loss function assumptions.
We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions.