CV · AI · LG · Jun 4, 2024

Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation

arXiv:2406.02347v3 · 54 citations · Has Code
Originality: Highly original
AI Analysis

Flash Diffusion addresses the computational cost of sampling from diffusion models, letting researchers and practitioners generate high-quality images in far fewer steps across a range of tasks.

The paper introduces Flash Diffusion, a distillation method that accelerates pre-trained diffusion models for few-step image generation. It achieves state-of-the-art FID and CLIP-Score on the COCO2014 and COCO2017 datasets while requiring only several GPU hours of training and fewer trainable parameters than existing methods.

In this paper, we propose Flash Diffusion, an efficient, fast, and versatile distillation method that accelerates generation with pre-trained diffusion models. The method reaches state-of-the-art performance in terms of FID and CLIP-Score for few-step image generation on the COCO2014 and COCO2017 datasets, while requiring only several GPU hours of training and fewer trainable parameters than existing methods. In addition to its efficiency, the method's versatility is demonstrated across several tasks, such as text-to-image, inpainting, face-swapping, and super-resolution, and across different backbones, such as UNet-based denoisers (SD1.5, SDXL) and DiT (Pixart-$\alpha$), as well as adapters. In all cases, the method drastically reduces the number of sampling steps while maintaining very high-quality image generation. The official implementation is available at https://github.com/gojasper/flash-diffusion.
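To make the idea of distilling a many-step sampler into a few-step one concrete, here is a minimal toy sketch in NumPy. It is not the Flash Diffusion objective or its official code: the "teacher" is a hypothetical 50-step iterative map on scalars, the "student" is a hypothetical one-step affine map, and the loss is plain mean squared error against the teacher's output. All function names and hyperparameters are assumptions made for illustration.

```python
import numpy as np


def teacher_sample(noise, num_steps=50, target=2.0):
    """Hypothetical multi-step teacher: each step nudges the sample toward `target`."""
    x = noise.copy()
    for _ in range(num_steps):
        x = x + 0.1 * (target - x)
    return x


def distill_student(num_iters=500, lr=0.05, batch_size=256, seed=0):
    """Fit a one-step student x_out = w * noise + b to the teacher's outputs.

    Plain output distillation with an MSE loss and hand-written gradients;
    an illustrative assumption, not the loss used in the paper.
    """
    rng = np.random.default_rng(seed)
    w, b = 1.0, 0.0
    for _ in range(num_iters):
        noise = rng.standard_normal(batch_size)
        teacher_out = teacher_sample(noise)   # 50 teacher steps
        student_out = w * noise + b           # 1 student step
        err = student_out - teacher_out
        # Gradients of mean(err ** 2) with respect to w and b.
        w -= lr * 2.0 * np.mean(err * noise)
        b -= lr * 2.0 * np.mean(err)
    return w, b


if __name__ == "__main__":
    w, b = distill_student()
    noise = np.random.default_rng(1).standard_normal(4)
    print("teacher (50 steps):", teacher_sample(noise))
    print("student (1 step):  ", w * noise + b)
```

In this toy setting the teacher happens to be an affine function of its input noise, so a one-step affine student can match it closely. Real diffusion teachers are far from affine, which is why practical distillation methods need expressive student networks and richer training objectives.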

Code Implementations: 1 repo