STOCPRMLApr 3, 2019

Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT

arXiv:1904.02130v148 citations
Originality Incremental advance
AI Analysis

This work addresses the need for reliable statistical inference in machine learning by enabling non-asymptotic confidence intervals and hypothesis tests for SGD, which is incremental as it builds on existing non-asymptotic analysis of averaged SGD.

The paper tackles the problem of establishing non-asymptotic convergence rates for Polyak-Ruppert averaged stochastic gradient descent (SGD) to a normal distribution, providing explicit rates for a multivariate martingale central limit theorem and applying it to derive confidence intervals and hypothesis tests for SGD-based parameter estimation.

We provide non-asymptotic convergence rates of the Polyak-Ruppert averaged stochastic gradient descent (SGD) to a normal random vector for a class of twice-differentiable test functions. A crucial intermediate step is proving a non-asymptotic martingale central limit theorem (CLT), i.e., establishing the rates of convergence of a multivariate martingale difference sequence to a normal random vector, which might be of independent interest. We obtain the explicit rates for the multivariate martingale CLT using a combination of Stein's method and Lindeberg's argument, which is then used in conjunction with a non-asymptotic analysis of averaged SGD proposed in [PJ92]. Our results have potentially interesting consequences for computing confidence intervals for parameter estimation with SGD and constructing hypothesis tests with SGD that are valid in a non-asymptotic sense.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes