Improved off-policy training of diffusion samplers
This work benchmarks diffusion-based samplers for amortized inference and proposes an exploration strategy that improves off-policy training methods.
The paper tackles the problem of training diffusion models to sample from distributions with given unnormalized densities, benchmarking existing methods and proposing a novel exploration strategy that combines local search with a replay buffer. The results show improved sample quality on a variety of target distributions, and the code is publicly available as a base for future work.
We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.
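To make the exploration idea concrete, the following is a minimal sketch of local search in the target space combined with a replay buffer. It is not the authors' implementation: the two-mode toy energy, the use of random-walk Metropolis as the local search, the names `energy`, `local_search`, and `ReplayBuffer`, and all hyperparameters are assumptions chosen for illustration. The sampler output is replaced by Gaussian noise, and how buffered samples enter the off-policy training loss is deliberately left out.

```python
import numpy as np

# Hypothetical toy target: an unnormalized two-mode Gaussian mixture in 2D.
MODES = np.array([[-3.0, 0.0], [3.0, 0.0]])

def energy(x):
    """Energy E(x) = -log p_unnorm(x) for a batch x of shape (n, 2)."""
    sq_dist = ((x[:, None, :] - MODES[None, :, :]) ** 2).sum(-1)  # (n, n_modes)
    logits = -0.5 * sq_dist
    m = logits.max(axis=1)
    # Numerically stable log-sum-exp over the mixture components.
    return -(m + np.log(np.exp(logits - m[:, None]).sum(axis=1)))

def local_search(x, rng, n_steps=100, step_size=0.5):
    """Refine samples with random-walk Metropolis steps targeting exp(-E).

    Assumption: random-walk Metropolis is a stand-in for whichever
    local search procedure is used in practice.
    """
    e = energy(x)
    for _ in range(n_steps):
        proposal = x + step_size * rng.standard_normal(x.shape)
        e_prop = energy(proposal)
        # Accept with probability min(1, exp(E(x) - E(proposal))).
        accept = np.log(rng.uniform(size=len(x))) < e - e_prop
        x = np.where(accept[:, None], proposal, x)
        e = np.where(accept, e_prop, e)
    return x

class ReplayBuffer:
    """Fixed-capacity buffer that keeps the most recently added samples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = None

    def add(self, x):
        merged = x if self.data is None else np.concatenate([self.data, x])
        self.data = merged[-self.capacity:]

    def sample(self, n, rng):
        idx = rng.integers(0, len(self.data), size=n)
        return self.data[idx]

rng = np.random.default_rng(0)
buffer = ReplayBuffer(capacity=1000)

# Stand-in for samples drawn from the current diffusion sampler.
sampler_output = rng.standard_normal((256, 2))
refined = local_search(sampler_output, rng)
buffer.add(refined)

# Off-policy training would draw batches like this one from the buffer.
batch = buffer.sample(64, rng)
```

In a full training loop, the refinement and buffer update would presumably be repeated periodically as the sampler improves, so that the buffer tracks low-energy regions the sampler has not yet learned to reach on its own.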