OC LG PR ML · Oct 9, 2023

Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting

arXiv:2310.06081v24.41 citations · h-index: 18

Originality: Incremental advance

AI Analysis

This work offers a general theoretical tool for analyzing stochastic algorithms in sampling, optimization, and boosting, though it appears incremental as it builds on existing discretization frameworks.

The authors developed a unified theoretical framework called Ito chains to analyze Markov chains resembling Euler-Maruyama discretizations of stochastic differential equations, proving improved bounds on the Wasserstein-2 distance between the chain and its continuous counterpart. Their results cover or enhance most existing estimates and provide the first analysis for some specific cases.

In this work, we consider a rather general and broad class of Markov chains, Ito chains, that look like the Euler-Maruyama discretization of some Stochastic Differential Equation. The chain we study is a unified framework for theoretical analysis. It comes with almost arbitrary isotropic and state-dependent noise instead of the normal and state-independent noise used in most related papers. Moreover, in our chain the drift and diffusion coefficient can be inexact in order to cover a wide range of applications such as Stochastic Gradient Langevin Dynamics, sampling, Stochastic Gradient Descent, and Stochastic Gradient Boosting. We prove a bound in $W_{2}$-distance between the laws of our Ito chain and the corresponding differential equation. These results improve or cover most of the known estimates, and for some particular cases our analysis is the first.
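To make the idea of an Euler-Maruyama-like chain with inexact drift and non-Gaussian noise concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction: the update form `x + step * drift + sqrt(step) * diffusion * eps`, the function names (`ito_chain`, `noisy_drift`, `rademacher`), the Ornstein-Uhlenbeck target, and all numeric values are hypothetical choices made for this example.

```python
import numpy as np

def ito_chain(x0, drift, diffusion, step, n_steps, noise, rng):
    """Simulate an Euler-Maruyama-style chain (illustrative form, assumed here):
    x_{k+1} = x_k + step * drift(x_k) + sqrt(step) * diffusion(x_k) * eps_k
    """
    x = np.asarray(x0, dtype=float)
    path = np.empty((n_steps + 1,) + x.shape)
    path[0] = x
    for k in range(n_steps):
        # eps is zero-mean with unit variance, but need not be Gaussian
        eps = noise(rng, x.shape)
        x = x + step * drift(x, rng) + np.sqrt(step) * diffusion(x) * eps
        path[k + 1] = x
    return path

def noisy_drift(x, rng):
    # Exact drift -x plus a small zero-mean perturbation, mimicking
    # a stochastic-gradient estimate (hypothetical noise level 0.1)
    return -x + 0.1 * rng.standard_normal(x.shape)

def constant_diffusion(x):
    # sqrt(2) gives a Langevin-type SDE dX = -X dt + sqrt(2) dW
    return np.sqrt(2.0)

def rademacher(rng, shape):
    # Non-Gaussian noise: independent +1/-1 entries (mean 0, variance 1)
    return rng.choice([-1.0, 1.0], size=shape)

rng = np.random.default_rng(0)
# 2000 independent one-dimensional chains, all started at 0
path = ito_chain(np.zeros(2000), noisy_drift, constant_diffusion,
                 step=0.01, n_steps=500, noise=rademacher, rng=rng)
final_var = path[-1].var()
print(f"empirical variance at the final step: {final_var:.3f}")
```

For this choice of drift and diffusion the limiting SDE has a standard normal stationary law, so the empirical variance across chains should be close to 1 even though each increment uses Rademacher rather than Gaussian noise.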
