LGOCMLJul 31, 2018

Stochastic Gradient Descent with Biased but Consistent Gradient Estimators

arXiv:1807.11880v446 citations
Originality Incremental advance
AI Analysis

This work addresses the computational bottleneck in SGD for scenarios like graph-based machine learning, offering a more efficient alternative while maintaining theoretical guarantees, though it is incremental as it builds on prior empirical findings.

The paper tackles the problem of expensive unbiased gradient estimators in stochastic gradient descent (SGD) for interconnected data like graphs, showing that using consistent estimators achieves the same convergence behavior as unbiased ones across strongly convex, convex, and nonconvex objectives, with verification on synthetic and real-world data.

Stochastic gradient descent (SGD), which dates back to the 1950s, is one of the most popular and effective approaches for performing stochastic optimization. Research on SGD resurged recently in machine learning for optimizing convex loss functions and training nonconvex deep neural networks. The theory assumes that one can easily compute an unbiased gradient estimator, which is usually the case due to the sample average nature of empirical risk minimization. There exist, however, many scenarios (e.g., graphs) where an unbiased estimator may be as expensive to compute as the full gradient because training examples are interconnected. Recently, Chen et al. (2018) proposed using a consistent gradient estimator as an economic alternative. Encouraged by empirical success, we show, in a general setting, that consistent estimators result in the same convergence behavior as do unbiased ones. Our analysis covers strongly convex, convex, and nonconvex objectives. We verify the results with illustrative experiments on synthetic and real-world data. This work opens several new research directions, including the development of more efficient SGD updates with consistent estimators and the design of efficient training algorithms for large-scale graphs.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes