CO LG ME ML Jul 25, 2022

Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics

Jeffrey Negrea, Jun Yang, Haoyue Feng, Daniel M. Roy, Jonathan H. Huggins

U of Toronto

arXiv:2207.12395v33.31 citations h-index: 38

Originality: Highly original

AI Analysis

This work addresses the theory-practice gap in tuning stochastic gradient algorithms (SGAs), giving statisticians and machine learning practitioners a systematic foundation for inference. It is incremental in the sense that it builds on existing SGA methods.

The authors tackle the problem of tuning stochastic gradient algorithms (SGAs) for statistical inference by developing a theoretical framework based on large-sample asymptotics. They show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. They validate these results with numerical experiments in finite-sample regimes.

The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory–practice gap by characterizing the large-sample statistical asymptotics of SGAs via a joint step-size–sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We also prove a Bernstein–von Mises-like theorem to guide tuning, including for generalized posteriors that are robust to model misspecification. Numerical experiments validate our results and recommendations in realistic finite-sample regimes. Our work lays the foundation for a systematic analysis of other stochastic gradient Markov chain Monte Carlo algorithms for a wide range of models.
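
To make "iterate averaging with a large fixed step size" concrete, here is a minimal toy sketch. It is not the authors' code or experimental setup. It assumes a one-dimensional Gaussian-mean model with unit variance, for which the MLE is the sample mean, and every name and setting in it (`averaged_sgd_mean`, `step_size`, `batch_size`, `burn_in`) is hypothetical and chosen only for illustration. It shows only that the averaged iterate lands close to the MLE in this toy model. It does not reproduce the paper's covariance results or tuning recommendations.

```python
import random
import statistics


def averaged_sgd_mean(data, step_size=0.5, batch_size=10,
                      n_iters=20000, burn_in=2000, seed=0):
    """Fixed-step minibatch SGD on a Gaussian-mean loss, with iterate averaging.

    Illustrative assumption: the per-observation loss is 0.5 * (theta - x)**2,
    so the minibatch gradient is theta minus the minibatch mean.
    """
    rng = random.Random(seed)
    theta = 0.0          # arbitrary starting point
    running_sum = 0.0    # sum of post-burn-in iterates
    kept = 0             # number of post-burn-in iterates
    for k in range(n_iters):
        batch = rng.sample(data, batch_size)      # minibatch without replacement
        grad = theta - sum(batch) / batch_size    # stochastic gradient estimate
        theta -= step_size * grad                 # fixed (non-decaying) step size
        if k >= burn_in:
            running_sum += theta
            kept += 1
    return running_sum / kept                     # averaged iterate


# Synthetic data from N(2, 1); the MLE of the mean is the sample mean.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(500)]
mle = statistics.fmean(data)
estimate = averaged_sgd_mean(data)
print(f"MLE: {mle:.4f}, averaged SGD iterate: {estimate:.4f}")
```

Individual iterates keep fluctuating because the step size never decays. The average over many iterates smooths out that fluctuation, which is why the averaged value is compared with the MLE rather than the last iterate.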
