LG | CV | Sep 12, 2023

Elucidating the solution space of extended reverse-time SDE for diffusion models

arXiv:2309.06169v3 | 8 citations | h-index: 32 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the efficiency-quality trade-off in diffusion models for image generation, offering a novel solver that combines the best of ODE and SDE methods, though it is incremental as it builds on existing frameworks.

The paper tackles the challenge of balancing speed and image quality in diffusion model sampling by formulating it as an Extended Reverse-Time SDE (ER SDE), unifying ODE and SDE approaches, and develops ER-SDE-Solvers that achieve state-of-the-art performance with 8.33 FID on ImageNet 128x128 in only 20 function evaluations.

Sampling from Diffusion Models can alternatively be seen as solving differential equations, where there is a challenge in balancing speed and image visual quality. ODE-based samplers offer rapid sampling time but reach a performance limit, whereas SDE-based samplers achieve superior quality, albeit with longer iterations. In this work, we formulate the sampling process as an Extended Reverse-Time SDE (ER SDE), unifying prior explorations into ODEs and SDEs. Theoretically, leveraging the semi-linear structure of ER SDE solutions, we offer exact solutions and approximate solutions for VP SDE and VE SDE, respectively. Based on the approximate solution space of the ER SDE, referred to as one-step prediction errors, we yield mathematical insights elucidating the rapid sampling capability of ODE solvers and the high-quality sampling ability of SDE solvers. Additionally, we unveil that VP SDE solvers stand on par with their VE SDE counterparts. Based on these findings, leveraging the dual advantages of ODE solvers and SDE solvers, we devise efficient high-quality samplers, namely ER-SDE-Solvers. Experimental results demonstrate that ER-SDE-Solvers achieve state-of-the-art performance across all stochastic samplers while maintaining efficiency of deterministic samplers. Specifically, on the ImageNet $128\times128$ dataset, ER-SDE-Solvers obtain 8.33 FID in only 20 function evaluations. Code is available at https://github.com/QinpengCui/ER-SDE-Solver.
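To make the "unifying ODEs and SDEs" idea concrete, here is a minimal sketch. The abstract does not state the equation, so this assumes the commonly used extended reverse-time form $\mathrm{d}x_t = [f(t)x_t - \tfrac{1}{2}(g^2(t)+h^2(t))\nabla_x \log p_t(x_t)]\,\mathrm{d}t + h(t)\,\mathrm{d}\bar{w}_t$, in which $h = 0$ gives a deterministic ODE sampler and $h = g$ gives the usual reverse-time SDE. The toy setup is entirely illustrative: 1-D Gaussian data with a closed-form score, a VE-style noise schedule $\sigma(t) = t$ (so $f = 0$, $g^2(t) = 2t$), a hypothetical knob `lam` with $h = \texttt{lam} \cdot g$, and a plain Euler-Maruyama step. It is not the ER-SDE-Solver from the paper, which exploits the semi-linear structure instead.

```python
import numpy as np


def sample(lam, n=20000, s0=1.0, T=10.0, steps=200, seed=0):
    """Integrate the assumed extended reverse-time SDE from t=T down to t=0.

    Toy assumptions (not from the paper): data ~ N(0, s0^2), sigma(t) = t,
    so p_t = N(0, s0^2 + t^2) and the score is known in closed form.
    lam = 0 -> deterministic ODE sampler; lam = 1 -> standard reverse SDE.
    """
    rng = np.random.default_rng(seed)
    ts = np.linspace(T, 0.0, steps + 1)

    # Start from the exact marginal at time T (available only in this toy case).
    x = rng.normal(0.0, np.sqrt(s0**2 + T**2), size=n)

    for t, t_next in zip(ts[:-1], ts[1:]):
        dt = t - t_next                  # positive step size, moving backward in time
        g2 = 2.0 * t                     # g(t)^2 = d(sigma^2)/dt for sigma(t) = t
        h2 = lam**2 * g2                 # h(t)^2 with h = lam * g
        score = -x / (s0**2 + t**2)      # closed-form score of N(0, s0^2 + t^2)
        noise = rng.normal(size=n)
        # Euler-Maruyama step of dx = -(g^2 + h^2)/2 * score dt + h dw (reverse time).
        x = x + 0.5 * (g2 + h2) * score * dt + np.sqrt(h2 * dt) * noise
    return x


if __name__ == "__main__":
    for lam in (0.0, 1.0):
        print(f"lam={lam}: sample std = {sample(lam).std():.3f} (data std = 1.0)")
```

Both settings should recover samples whose spread is close to that of the data, which is the sense in which the drift and diffusion terms trade off while keeping the same marginals. The difference the paper studies, discretization error at low step counts, only shows up when `steps` is reduced.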

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
