Multilevel and Sequential Monte Carlo for Training-Free Diffusion Guidance
This work addresses the need for efficient, unbiased conditional generation in diffusion models for applications such as image synthesis, though the contribution is incremental, since it builds on existing Monte Carlo methods.
The paper tackles the problem of accurate, training-free guidance for conditional generation in diffusion models by proposing a sequential Monte Carlo framework with multilevel variance reduction. It reports state-of-the-art results for training-free guidance, including 95.6% accuracy on CIFAR-10 class-conditional generation at 3x lower cost-per-success than baselines.
We address the problem of accurate, training-free guidance for conditional generation in trained diffusion models. Existing methods typically rely on point estimates to approximate the posterior score, often resulting in biased approximations that fail to capture the multimodality inherent to the reverse process of diffusion models. We propose a sequential Monte Carlo (SMC) framework that constructs an unbiased estimator of $p_\theta(y \mid x_t)$ by integrating over the full denoising distribution via Monte Carlo approximation. To ensure computational tractability, we incorporate variance-reduction schemes based on Multi-Level Monte Carlo (MLMC). Our approach sets a new state of the art for training-free guidance on CIFAR-10 class-conditional generation, reaching $95.6\%$ accuracy with $3\times$ lower cost-per-success than baselines. On ImageNet, our algorithm achieves a $1.5\times$ cost-per-success advantage over existing methods.
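To illustrate why a point estimate can be biased when the denoising distribution is multimodal, the following is a minimal one-dimensional sketch, not the paper's implementation. The bimodal sampler `sample_x0_given_xt` and the Gaussian-bump `likelihood` are hypothetical stand-ins for $p_\theta(x_0 \mid x_t)$ and $p(y \mid x_0)$, chosen only so the two estimators can be compared; the example does not include the SMC or MLMC components.

```python
import math
import random


def sample_x0_given_xt(rng):
    """Hypothetical stand-in for the denoising distribution p(x_0 | x_t):
    an equal-weight mixture of two narrow Gaussians centered at -2 and +2."""
    center = 2.0 if rng.random() < 0.5 else -2.0
    return rng.gauss(center, 0.1)


def likelihood(x0):
    """Hypothetical stand-in for p(y | x_0): a bump in (0, 1] that is
    high only near the mode at x_0 = +2."""
    return math.exp(-((x0 - 2.0) ** 2) / (2 * 0.5 ** 2))


def point_estimate(rng, n):
    """Evaluate the likelihood at the (estimated) posterior mean of x_0.
    For a bimodal distribution the mean falls between the modes."""
    mean_x0 = sum(sample_x0_given_xt(rng) for _ in range(n)) / n
    return likelihood(mean_x0)


def monte_carlo_estimate(rng, n):
    """Average the likelihood over samples of x_0: an unbiased estimate of
    E[p(y | x_0)] under the toy denoising distribution."""
    return sum(likelihood(sample_x0_given_xt(rng)) for _ in range(n)) / n


rng = random.Random(0)
point = point_estimate(rng, 10_000)
mc = monte_carlo_estimate(rng, 10_000)
print(f"point estimate:       {point:.4f}")
print(f"Monte Carlo estimate: {mc:.4f}")
```

In this toy setting the exact value of the expectation is about 0.49, since roughly half of the mass sits on the mode where the likelihood is high. The Monte Carlo average lands close to that value, while the point estimate evaluates the likelihood between the modes, where almost no mass lies, and so returns a value near zero.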