LG ML Feb 7, 2024

Improved off-policy training of diffusion samplers

Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin

arXiv:2402.05098v428.247 citations | h-index: 56 | Has Code | NIPS

Originality: Incremental advance

AI Analysis

This work addresses sampling challenges in diffusion models for amortized inference, offering incremental improvements to off-policy training methods.

The paper tackles the problem of training diffusion models to sample from distributions specified only by an unnormalized density. It benchmarks existing diffusion-structured inference methods and proposes a novel exploration strategy for off-policy training that combines local search in the target space with a replay buffer. The results show improved sample quality on a variety of target distributions, and the code is publicly available as a base for future work.

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.
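The exploration idea described in the abstract (local search in the target space combined with a replay buffer) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the toy double-well `energy`, the use of random-walk Metropolis as the local search, the `ReplayBuffer` class, and all hyperparameters are hypothetical. Gaussian noise stands in for samples drawn from the diffusion sampler, and the off-policy training loss itself is omitted.

```python
import numpy as np

def energy(x):
    # Hypothetical toy target: double well per coordinate, minima at +/-2.
    # The unnormalized density is exp(-energy(x)).
    return np.sum((x**2 - 4.0) ** 2, axis=-1) / 8.0

def local_search(x, rng, n_steps=100, step_size=0.2):
    """Refine a batch of points (shape (n, d)) with random-walk Metropolis.

    This is a stand-in for a local search procedure, not necessarily
    the one used by the authors.
    """
    e = energy(x)
    for _ in range(n_steps):
        proposal = x + step_size * rng.standard_normal(x.shape)
        e_prop = energy(proposal)
        # Accept with probability min(1, exp(e - e_prop)).
        accept = np.log(rng.uniform(size=e.shape)) < e - e_prop
        x = np.where(accept[:, None], proposal, x)
        e = np.where(accept, e_prop, e)
    return x, e

class ReplayBuffer:
    """Hypothetical fixed-capacity buffer keeping the most recent samples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.x = None
        self.e = None

    def add(self, x, e):
        if self.x is None:
            self.x, self.e = x, e
        else:
            self.x = np.concatenate([self.x, x])
            self.e = np.concatenate([self.e, e])
        # Drop the oldest entries once capacity is exceeded.
        self.x = self.x[-self.capacity:]
        self.e = self.e[-self.capacity:]

    def sample(self, batch_size, rng):
        idx = rng.integers(0, len(self.x), size=batch_size)
        return self.x[idx], self.e[idx]

rng = np.random.default_rng(0)
buffer = ReplayBuffer(capacity=1000)

# Stand-in for samples produced by the current diffusion sampler.
x0 = rng.standard_normal((256, 2))

# Refine the samples in the target space and store them.
x_refined, e_refined = local_search(x0, rng)
buffer.add(x_refined, e_refined)

# An off-policy objective could then be trained on batches like this one.
x_batch, e_batch = buffer.sample(64, rng)
```

In this sketch the buffer decouples the points used for training from the sampler's current behaviour, which is what makes the scheme off-policy: refined, low-energy points can be revisited many times even if the sampler rarely reaches them on its own.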
