LG ML | Jan 10, 2025

From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training

Julius Berner, Lorenz Richter, Marcin Sendera, Jarrid Rector-Brooks, Nikolay Malkin

arXiv:2501.06148v123.318 citations | h-index: 29 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the computational efficiency challenge in training diffusion models for sampling tasks, offering a method that reduces costs while maintaining performance, though it appears incremental as it builds on existing time-reversal and RL approaches.

The paper tackles the problem of training neural stochastic differential equations (diffusion models) to sample from Boltzmann distributions without access to target samples. It proves that entropic RL objectives (GFlowNets) and continuous-time objects (partial differential equations and path space measures) become equivalent in the limit of infinitesimally small discretization steps. It also shows that a suitably chosen coarse time discretization during training improves sample efficiency and reduces computational cost while remaining competitive on standard sampling benchmarks.

We study the problem of training neural stochastic differential equations, or diffusion models, to sample from a Boltzmann distribution without access to target samples. Existing methods for training such models enforce time-reversal of the generative and noising processes, using either differentiable simulation or off-policy reinforcement learning (RL). We prove equivalences between families of objectives in the limit of infinitesimal discretization steps, linking entropic RL methods (GFlowNets) with continuous-time objects (partial differential equations and path space measures). We further show that an appropriate choice of coarse time discretization during training allows greatly improved sample efficiency and the use of time-local objectives, achieving competitive performance on standard sampling benchmarks with reduced computational cost.
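To make the role of the time discretization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a uniform time grid, a constant diffusion coefficient, and an Euler-Maruyama scheme, under which each step is a Gaussian transition that can be read as a discrete-time policy. The names `simulate`, `gaussian_logpdf`, and the linear `drift` are hypothetical stand-ins for illustration; in practice the drift would be a neural network.

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    """Log-density of an isotropic Gaussian N(mean, var * I), evaluated at x."""
    d = x.shape[-1]
    sq_dist = np.sum((x - mean) ** 2, axis=-1)
    return -0.5 * (sq_dist / var + d * np.log(2.0 * np.pi * var))

def simulate(drift, x0, n_steps, T=1.0, sigma=1.0, rng=None):
    """Euler-Maruyama rollout of dX = drift(X, t) dt + sigma dW on a uniform grid.

    Returns the trajectory with shape (n_steps + 1, batch, dim) and the summed
    log-probabilities of the Gaussian forward transitions, shape (batch,).
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = x0
    xs = [x0]
    log_pf = np.zeros(x0.shape[0])
    for k in range(n_steps):
        t = k * dt
        mean = x + drift(x, t) * dt              # one drift evaluation per step
        noise = rng.standard_normal(x.shape)
        x_next = mean + sigma * np.sqrt(dt) * noise
        log_pf += gaussian_logpdf(x_next, mean, sigma**2 * dt)
        xs.append(x_next)
        x = x_next
    return np.stack(xs), log_pf

# Hypothetical linear drift standing in for a learned neural drift.
drift = lambda x, t: -x
x0 = np.zeros((4, 2))

# A coarse grid needs 8 drift evaluations per trajectory, a fine grid 128.
xs_coarse, logp_coarse = simulate(drift, x0, n_steps=8, rng=np.random.default_rng(0))
xs_fine, logp_fine = simulate(drift, x0, n_steps=128, rng=np.random.default_rng(0))
```

The cost of one rollout grows linearly with `n_steps`, which is why the choice of training discretization matters for computational cost. The objectives themselves (which compare forward and backward path probabilities) are not reproduced here.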
