A sharp uniform-in-time error estimate for Stochastic Gradient Langevin Dynamics
The paper provides improved theoretical guarantees for SGLD, which is widely used for sampling in machine learning, but the contribution is incremental because it builds on existing analysis.
The paper analyzes the error of Stochastic Gradient Langevin Dynamics (SGLD), a sampling algorithm. It establishes a sharp uniform-in-time error estimate: an $O(\eta^2)$ bound on the KL-divergence between the SGLD iteration and the Langevin diffusion, where $\eta$ is the step size, and, as a consequence, an $O(\eta)$ bound on the distance between their invariant measures.
We establish a sharp uniform-in-time error estimate for Stochastic Gradient Langevin Dynamics (SGLD), a widely used sampling algorithm. Under mild assumptions, we obtain a uniform-in-time $O(\eta^2)$ bound for the KL-divergence between the SGLD iteration and the Langevin diffusion, where $\eta$ is the step size (or learning rate). Our analysis is also valid for varying step sizes. Consequently, we derive an $O(\eta)$ bound for the distance between the invariant measures of the SGLD iteration and the Langevin diffusion, in terms of Wasserstein or total variation distance. Our result can be viewed as a significant improvement over existing analyses of SGLD in the related literature.
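For readers unfamiliar with the algorithm, the SGLD iteration with step size $\eta$ can be sketched as follows. This is a minimal illustration and not the paper's setup: the quadratic toy potential, the minibatch gradient, the inverse temperature `beta`, and all function and parameter names are assumptions made for this example.

```python
import numpy as np


def sgld(data, eta, n_steps, batch_size=8, beta=1.0, x0=0.0, seed=0):
    """Run a 1-D SGLD chain for the toy potential U(x) = mean_i (x - data_i)^2 / 2.

    Update (assumed standard form):
        x_{k+1} = x_k - eta * g_k + sqrt(2 * eta / beta) * z_k,
    where g_k is a minibatch estimate of U'(x_k) and z_k ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    x = float(x0)
    samples = np.empty(n_steps)
    for k in range(n_steps):
        # Random minibatch: gives an unbiased estimate of U'(x) = x - mean(data).
        batch = rng.choice(data, size=batch_size, replace=False)
        grad = np.mean(x - batch)
        # Gradient step plus injected Gaussian noise.
        x = x - eta * grad + np.sqrt(2.0 * eta / beta) * rng.standard_normal()
        samples[k] = x
    return samples


if __name__ == "__main__":
    data = np.linspace(-1.0, 1.0, 32)  # hypothetical data with mean 0
    chain = sgld(data, eta=0.05, n_steps=20000)
    burned = chain[2000:]  # discard an initial burn-in segment
    print("sample mean:", burned.mean())
    print("sample variance:", burned.var())
```

For this toy potential with `beta = 1`, the Langevin diffusion has invariant measure $N(\bar a, 1)$, where $\bar a$ is the mean of `data`. Comparing the empirical variance of the chain with 1 for several values of `eta` gives a rough numerical feel for the step-size dependence of the bias in the invariant measure.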