CV · Oct 12, 2024

DuoDiff: Accelerating Diffusion Models with a Dual-Backbone Approach

Daniel Gallo Fernández, Răzvan-Andrei Matişan, Alejandro Monroy Muñoz, Ana-Maria Vasilcoiu, Janusz Partyka, Tin Hadži Veljković, Metod Jazbec

arXiv:2410.09633v23.71 citations · h-index: 6 · Has Code

Originality: Incremental advance

AI Analysis

The work targets the slow inference that limits practical use of diffusion models. The advance is incremental because it builds on existing early-exiting techniques.

The paper tackles the slow inference of diffusion models by proposing a dual-backbone approach that uses a shallower network for initial steps and a deeper one for later steps, outperforming existing early-exit methods in speed and quality.

Diffusion models have achieved unprecedented performance in image generation, yet they suffer from slow inference due to their iterative sampling process. To address this, early-exiting has recently been proposed, where the depth of the denoising network is made adaptive based on the (estimated) difficulty of each sampling step. Here, we discover an interesting "phase transition" in the sampling process of current adaptive diffusion models: the denoising network consistently exits early during the initial sampling steps, until it suddenly switches to utilizing the full network. Based on this, we propose accelerating generation by employing a shallower denoising network in the initial sampling steps and a deeper network in the later steps. We demonstrate empirically that our dual-backbone approach, DuoDiff, outperforms existing early-exit diffusion methods in both inference speed and generation quality. Importantly, DuoDiff is easy to implement and complementary to existing approaches for accelerating diffusion.
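The dual-backbone idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `dual_backbone_sample`, `switch_step`, `step_size`, and the toy denoisers are hypothetical names introduced here, and the update rule is a placeholder rather than a real DDPM or DDIM step. The sketch only shows the control flow, in which a cheaper network handles the first sampling steps and a deeper one handles the rest.

```python
from typing import Callable, List, Tuple

Vector = List[float]
# A denoiser maps (current sample, timestep) to a noise estimate.
Denoiser = Callable[[Vector, int], Vector]


def dual_backbone_sample(
    x: Vector,
    num_steps: int,
    switch_step: int,   # hypothetical: number of initial steps given to the shallow net
    shallow: Denoiser,
    deep: Denoiser,
    step_size: float = 0.1,  # hypothetical placeholder for the sampler's update rule
) -> Tuple[Vector, List[str]]:
    """Run a toy reverse process, switching backbones at a fixed step index."""
    schedule: List[str] = []
    for i in range(num_steps):
        t = num_steps - 1 - i          # timesteps count down from noisy to clean
        use_shallow = i < switch_step  # initial sampling steps use the shallow net
        net = shallow if use_shallow else deep
        schedule.append("shallow" if use_shallow else "deep")
        eps = net(x, t)
        # Placeholder update: subtract a fraction of the predicted noise.
        x = [xi - step_size * ei for xi, ei in zip(x, eps)]
    return x, schedule


# Toy stand-ins for the two backbones (assumptions, not real networks).
def toy_shallow(x: Vector, t: int) -> Vector:
    return [0.5 * xi for xi in x]


def toy_deep(x: Vector, t: int) -> Vector:
    return list(x)


sample, schedule = dual_backbone_sample(
    [1.0, -2.0], num_steps=10, switch_step=6, shallow=toy_shallow, deep=toy_deep
)
print(schedule)
```

In a real setting the two callables would be trained denoising networks of different depth, and the switch point would be chosen from the observed early-exit behaviour. The abstract does not state a specific value.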
