VarDiU: A Variational Diffusive Upper Bound for One-Step Diffusion Distillation
This work targets a specific bottleneck in diffusion distillation and is aimed at researchers and practitioners who want to deploy efficient generative models. It is an incremental improvement over existing techniques.
The paper tackles biased gradient estimation in diffusion distillation: existing methods approximate the training gradient with a student score function learned by imperfect denoising score matching, which leads to sub-optimal performance when compressing multi-step diffusion models into one-step generators. It proposes VarDiU, a variational diffusive upper bound that admits an unbiased gradient estimator, and reports higher generation quality and more efficient, stable training than Diff-Instruct.
Recently, diffusion distillation methods have compressed thousand-step teacher diffusion models into one-step student generators while preserving sample quality. Most existing approaches train the student model using a diffusive divergence whose gradient is approximated via the student's score function, learned through denoising score matching (DSM). Since DSM training is imperfect, the resulting gradient estimate is inevitably biased, leading to sub-optimal performance. In this paper, we propose VarDiU (pronounced /va:rdju:/), a Variational Diffusive Upper Bound that admits an unbiased gradient estimator and can be directly applied to diffusion distillation. Using this objective, we compare our method with Diff-Instruct and demonstrate that it achieves higher generation quality and enables a more efficient and stable training procedure for one-step diffusion distillation.
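To make the role of denoising score matching concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's implementation. It assumes a 1-D standard Gaussian as a stand-in for the student distribution, a single noise level `sigma`, and a hypothetical linear score model `slope * x`. The helper names (`perturb`, `dsm_loss`, `fit_linear_score`, `true_slope`) are illustrative only. In this toy case the exact noisy-marginal score is known in closed form, so the error of a DSM-fitted score can be measured directly.

```python
import numpy as np


def perturb(x0, sigma, rng):
    """Add Gaussian noise to x0; return noisy samples and the DSM target."""
    eps = rng.standard_normal(x0.shape)
    xt = x0 + sigma * eps
    # DSM regression target: gradient of log N(xt; x0, sigma^2) w.r.t. xt.
    target = -eps / sigma
    return xt, target


def dsm_loss(slope, xt, target):
    """Empirical DSM loss for the assumed linear score model s(x) = slope * x."""
    return float(np.mean((slope * xt - target) ** 2))


def fit_linear_score(xt, target):
    """Closed-form least-squares minimiser of the empirical DSM loss."""
    return float(np.sum(xt * target) / np.sum(xt * xt))


def true_slope(sigma):
    """Exact score slope of the noisy marginal N(0, 1 + sigma^2)."""
    return -1.0 / (1.0 + sigma ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = 0.5
    for n in (100, 10_000, 1_000_000):
        x0 = rng.standard_normal(n)  # stand-in for student samples
        xt, target = perturb(x0, sigma, rng)
        a_hat = fit_linear_score(xt, target)
        print(n, a_hat, a_hat - true_slope(sigma))
```

In this toy setting the fitted slope differs from the exact one whenever the sample is finite. A distillation gradient that plugs in the learned score in place of the true one inherits an error of this kind, which is the source of bias the abstract describes.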