HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models
This is an incremental improvement for users of diffusion models in image generation, enabling faster and higher-fidelity outputs without requiring retraining.
The paper tackles the problem of diffusion models producing unrealistic images when sampling with few steps or low guidance scales. It proposes HiGS, a momentum-based sampling technique that integrates past model predictions into each step to improve detail and structure. With a pretrained SiT model, HiGS reaches a new state-of-the-art FID of 1.61 for unguided ImageNet generation at 256x256 with only 30 steps, at practically no additional computational cost.
While diffusion models have made remarkable progress in image generation, their outputs can still appear unrealistic and lack fine details, especially when using fewer neural function evaluations (NFEs) or lower guidance scales. To address this issue, we propose a novel momentum-based sampling technique, termed history-guided sampling (HiGS), which enhances the quality and efficiency of diffusion sampling by integrating recent model predictions into each inference step. Specifically, HiGS leverages the difference between the current prediction and a weighted average of past predictions to steer the sampling process toward more realistic outputs with better details and structure. Our approach introduces practically no additional computation and integrates seamlessly into existing diffusion frameworks, requiring neither extra training nor fine-tuning. Extensive experiments show that HiGS consistently improves image quality across diverse models and architectures and under varying sampling budgets and guidance scales. Moreover, using a pretrained SiT model, HiGS achieves a new state-of-the-art FID of 1.61 for unguided ImageNet generation at 256$\times$256 with only 30 sampling steps (instead of the standard 250). We thus present HiGS as a plug-and-play enhancement to standard diffusion sampling that enables faster generation with higher fidelity.
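The core idea described above, steering with the difference between the current prediction and a weighted average of past predictions, can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, the guidance strength, or which model output the correction is applied to, so the function name `history_guided_prediction`, the exponentially decaying weights, and the `scale` and `decay` parameters below are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np


def history_guided_prediction(current, history, scale=0.5, decay=0.5):
    """Push `current` away from a weighted average of past predictions.

    Hypothetical sketch: `history` is a list of earlier predictions ordered
    oldest to newest. The exponential weighting and the `scale` and `decay`
    defaults are assumptions; the abstract does not state them.
    """
    if len(history) == 0:
        # No history yet (first sampling step): leave the prediction as-is.
        return current

    # Assumed weighting: the most recent prediction gets the largest weight.
    n = len(history)
    weights = np.array([decay ** (n - 1 - k) for k in range(n)], dtype=float)
    weights = weights / weights.sum()

    # Weighted average of the past predictions.
    average = sum(w * h for w, h in zip(weights, history))

    # Steer along the difference between the current prediction and the average.
    return current + scale * (current - average)


# Toy usage with 2-element "predictions" standing in for model outputs.
current = np.array([1.0, 2.0])
history = [np.zeros(2), np.full(2, 3.0)]
guided = history_guided_prediction(current, history)
```

In a sampling loop, the guided prediction would take the place of the raw model output at each step, with the raw output appended to the history afterwards. The only extra work is a few array additions per step, which is in line with the claim of practically no additional computation.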