CV · Oct 12, 2024

DuoDiff: Accelerating Diffusion Models with a Dual-Backbone Approach

arXiv:2410.09633v2 · 1 citation · h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses the inference-speed bottleneck that users of diffusion models face, though the advance is incremental because it builds on existing early-exiting techniques.

The paper tackles the slow inference of diffusion models by proposing a dual-backbone approach that uses a shallower network for initial steps and a deeper one for later steps, outperforming existing early-exit methods in speed and quality.

Diffusion models have achieved unprecedented performance in image generation, yet they suffer from slow inference due to their iterative sampling process. To address this, early-exiting has recently been proposed, where the depth of the denoising network is made adaptive based on the (estimated) difficulty of each sampling step. Here, we discover an interesting "phase transition" in the sampling process of current adaptive diffusion models: the denoising network consistently exits early during the initial sampling steps, until it suddenly switches to utilizing the full network. Based on this, we propose accelerating generation by employing a shallower denoising network in the initial sampling steps and a deeper network in the later steps. We demonstrate empirically that our dual-backbone approach, DuoDiff, outperforms existing early-exit diffusion methods in both inference speed and generation quality. Importantly, DuoDiff is easy to implement and complementary to existing approaches for accelerating diffusion.
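The dual-backbone idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dual_backbone_sample`, the `switch_step` parameter, and the toy denoisers are all hypothetical stand-ins, and the abstract does not say how the switch point is chosen. The sketch only shows the control flow, with a cheaper network handling the initial sampling steps and a deeper one handling the rest.

```python
from typing import Callable, List, Tuple

# A denoiser maps (current sample, step index) to the next sample.
Denoiser = Callable[[List[float], int], List[float]]


def dual_backbone_sample(
    x: List[float],
    num_steps: int,
    switch_step: int,  # hypothetical: first step handled by the deep network
    shallow: Denoiser,
    deep: Denoiser,
) -> Tuple[List[float], List[str]]:
    """Run num_steps sampling steps, using `shallow` before switch_step
    and `deep` from switch_step onward. Also returns which network ran
    at each step."""
    used = []
    for i in range(num_steps):
        # i counts sampling steps from the start of generation.
        if i < switch_step:
            x = shallow(x, i)
            used.append("shallow")
        else:
            x = deep(x, i)
            used.append("deep")
    return x, used


# Toy stand-ins for real denoising networks (assumptions for illustration).
def toy_shallow(x: List[float], i: int) -> List[float]:
    return [0.9 * v for v in x]


def toy_deep(x: List[float], i: int) -> List[float]:
    return [0.5 * v for v in x]


sample, schedule = dual_backbone_sample(
    [1.0, -2.0], num_steps=5, switch_step=3,
    shallow=toy_shallow, deep=toy_deep,
)
print(schedule)
```

Unlike early-exiting, where the exit depth is decided at every step, this schedule is fixed in advance, so no per-step difficulty estimate is needed at inference time.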

Code Implementations: 1 repo
