LG | AI | May 6, 2025

Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Autospeculation

arXiv:2505.03983v2 | 8 citations | h-index: 66 | ICML
Originality: Incremental advance
AI Analysis

This work addresses the computational inefficiency of diffusion-model inference in generative AI and offers a method to speed it up. The advance is incremental: it adapts existing techniques to a new setting.

The paper tackles the inference-time bottleneck in Denoising Diffusion Probabilistic Models (DDPMs) by proving an exchangeability property that enables near-black-box adaptation of optimization techniques from autoregressive models. It introduces Autospeculative Decoding (ASD), which achieves a $\tilde{O}(K^{\frac{1}{3}})$ parallel runtime speedup over the $K$-step sequential DDPM, and reports that a practical implementation accelerates DDPM inference significantly in various domains.

Denoising Diffusion Probabilistic Models (DDPMs) have emerged as powerful tools for generative modeling. However, their sequential computation requirements lead to significant inference-time bottlenecks. In this work, we utilize the connection between DDPMs and Stochastic Localization to prove that, under an appropriate reparametrization, the increments of DDPM satisfy an exchangeability property. This general insight enables near-black-box adaptation of various performance optimization techniques from autoregressive models to the diffusion setting. To demonstrate this, we introduce Autospeculative Decoding (ASD), an extension of the widely used speculative decoding algorithm to DDPMs that does not require any auxiliary draft models. Our theoretical analysis shows that ASD achieves a $\tilde{O}(K^{\frac{1}{3}})$ parallel runtime speedup over the $K$-step sequential DDPM. We also demonstrate that a practical implementation of autospeculative decoding accelerates DDPM inference significantly in various domains.
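
To give a feel for the $\tilde{O}(K^{\frac{1}{3}})$ claim, the following sketch tabulates what that scaling would imply if every constant and logarithmic factor hidden by the $\tilde{O}$ notation were equal to 1. That is an illustrative assumption, not a result from the paper, and the function name `idealized_asd_scaling` is hypothetical. Under this assumption a $K$-step sequential sampler would need roughly $K / K^{\frac{1}{3}} = K^{\frac{2}{3}}$ parallel rounds.

```python
from typing import Tuple


def idealized_asd_scaling(num_steps: int) -> Tuple[float, float]:
    """Return (speedup, parallel_rounds) under an idealized K^(1/3) scaling.

    Assumption: constants and logarithmic factors hidden by the tilde-O
    notation are set to 1, so the numbers only illustrate growth rates.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be a positive integer")
    speedup = num_steps ** (1.0 / 3.0)
    # Sequential cost K divided by the speedup, which equals K^(2/3).
    parallel_rounds = num_steps / speedup
    return speedup, parallel_rounds


if __name__ == "__main__":
    for k in (64, 512, 1000):
        speedup, rounds = idealized_asd_scaling(k)
        print(f"K={k:5d}  speedup ~ {speedup:5.1f}x  parallel rounds ~ {rounds:6.1f}")
```

The speedup grows slowly with $K$, so the idealized benefit is largest for samplers that use many denoising steps. Real speedups depend on the hidden factors and on the available parallel hardware.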
