ML, LG · Oct 25, 2019

Bias-Variance Tradeoff in a Sliding Window Implementation of the Stochastic Gradient Algorithm

arXiv:1910.11868v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work targets optimization efficiency for machine learning practitioners. It offers an incremental improvement in stochastic gradient algorithm design for quadratic and convex problems.

The paper tackles the bias-variance tradeoff in stochastic gradient algorithms by analyzing mean squared error (MSE) through the asymptotic normality of the iterates. It introduces a sliding window SGD (SW-SGD) algorithm that achieves lower MSE than standard SGD on quadratic and convex problems, and reports numerical results in support of this.
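
For reference, the tradeoff rests on the standard decomposition of MSE into squared bias and variance. The notation here is assumed for illustration and is not taken from the paper: $x_n$ is the iterate after $n$ steps and $x^*$ is the minimizer.

```latex
\[
\mathbb{E}\,\lVert x_n - x^* \rVert^2
  = \bigl\lVert \mathbb{E}[x_n] - x^* \bigr\rVert^2
  + \operatorname{tr}\bigl(\operatorname{Cov}(x_n)\bigr)
\]
```

A biased gradient estimator can therefore lower the total MSE whenever the variance it removes outweighs the squared bias it introduces.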

This paper provides a framework to analyze stochastic gradient algorithms in a mean squared error (MSE) sense using the asymptotic normality result of the stochastic gradient descent (SGD) iterates. We perform this analysis by taking the asymptotic normality result and applying it to the finite iteration case. Specifically, we look at problems where the gradient estimators are biased but have reduced variance, and compare the iterates generated by these gradient estimators to the iterates generated by the SGD algorithm. We use the work of Fabian to characterize the mean and the variance of the distribution of the iterates in terms of the bias and the covariance matrix of the gradient estimators. We introduce the sliding window SGD (SW-SGD) algorithm, with its proof of convergence, which incurs a lower MSE than the SGD algorithm on quadratic and convex problems. Lastly, we present some numerical results to show the effectiveness of this framework and the superiority of the SW-SGD algorithm over the SGD algorithm.
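
The abstract does not spell out the SW-SGD update, so the following is only a minimal sketch under a stated assumption: the sliding window estimator is taken to be the average of the most recent `window` noisy gradients, including stale ones evaluated at earlier iterates. The function names (`run_sgd`, `empirical_mse`), the one-dimensional quadratic objective, the Gaussian noise model, and the step size schedule are all hypothetical choices for illustration and may differ from the paper's algorithm. It is meant to show how one might measure the MSE of a biased, reduced-variance estimator against plain SGD, not to reproduce the paper's results.

```python
import random
from collections import deque


def run_sgd(window, n_iters=200, step=1.0, curvature=1.0, noise_std=1.0,
            x0=5.0, seed=0):
    """Minimize f(x) = 0.5 * curvature * x**2 from noisy gradients.

    window=1 is plain SGD. window>1 averages the most recent `window`
    noisy gradients (an ASSUMED form of the sliding window estimator).
    """
    rng = random.Random(seed)
    recent = deque(maxlen=window)  # the oldest gradient drops out automatically
    x = x0
    for n in range(1, n_iters + 1):
        # Noisy gradient of the quadratic at the current iterate.
        grad = curvature * x + rng.gauss(0.0, noise_std)
        recent.append(grad)
        # Stale gradients introduce bias; averaging reduces variance.
        estimate = sum(recent) / len(recent)
        x -= (step / n) * estimate  # decaying step size a_n = step / n
    return x


def empirical_mse(window, n_runs=200, **kwargs):
    """Average squared distance to the minimizer x* = 0 over independent runs."""
    total = 0.0
    for seed in range(n_runs):
        total += run_sgd(window, seed=seed, **kwargs) ** 2
    return total / n_runs


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(f"window={k:2d}  empirical MSE={empirical_mse(k):.4f}")
```

Setting `window=1` recovers plain SGD, which gives a baseline for the comparison. Whether a larger window lowers the empirical MSE in this toy setup depends on the step size, noise level, and iteration count, so the numbers it prints should be read as exploratory.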

