LG · AI · Dec 17, 2024

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

arXiv:2412.12444v3 · 41 citations · h-index: 20 · Has Code · AAAI
Originality: Incremental advance
AI Analysis

This work addresses the computational bottleneck in generative AI models, offering a practical acceleration method for real-time applications, though it is incremental as it builds on existing diffusion transformer architectures.

The paper tackles slow inference in Diffusion Transformers by proposing LazyDiT, a lazy learning framework that reuses cached results from earlier denoising steps to skip redundant computations. It outperforms the DDIM sampler across multiple models and resolutions, and a mobile-device implementation achieves better performance than DDIM at similar latency.

Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance and efficacy across various applications. The promising results come at the cost of slow inference, as each denoising step requires running the whole transformer model with a large number of parameters. In this paper, we show that performing the full computation of the model at each diffusion step is unnecessary, as some computations can be skipped by lazily reusing the results of previous steps. Furthermore, we show that the lower bound of similarity between outputs at consecutive steps is notably high, and this similarity can be linearly approximated using the inputs. To verify these claims, we propose LazyDiT, a lazy learning framework that efficiently leverages cached results from earlier steps to skip redundant computations. Specifically, we incorporate lazy learning layers into the model, which are trained to maximize laziness and enable dynamic skipping of redundant computations. Experimental results show that LazyDiT outperforms the DDIM sampler across multiple diffusion transformer models at various resolutions. Furthermore, we implement our method on mobile devices, achieving better performance than DDIM with similar latency. Code: https://github.com/shawnricecake/lazydit
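The skip-and-reuse idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `LazyBlock`, the hand-set gate weights, the sigmoid gate, and the 0.5 threshold are all assumptions made for illustration. The sketch assumes only what the abstract states, namely that a cheap linear function of the input estimates how similar the new output would be to the cached one, and that the expensive computation is skipped when that estimate is high.

```python
import math
from typing import Callable, List, Optional

Vector = List[float]


class LazyBlock:
    """Hypothetical wrapper: reuse a module's cached output when a linear
    gate on the input predicts the new output would be similar to it."""

    def __init__(self, module: Callable[[Vector], Vector],
                 weights: Vector, bias: float, threshold: float = 0.5):
        self.module = module          # the expensive computation (e.g. attention or MLP)
        self.weights = weights        # gate weights (trained in practice; hand-set here)
        self.bias = bias
        self.threshold = threshold    # assumed skip threshold
        self.cache: Optional[Vector] = None
        self.computed = 0             # number of full computations
        self.skipped = 0              # number of cache reuses

    def predicted_similarity(self, x: Vector) -> float:
        # Linear function of the input, squashed to (0, 1) with a sigmoid.
        logit = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))

    def __call__(self, x: Vector) -> Vector:
        # Skip only if a previous step left something in the cache.
        if self.cache is not None and self.predicted_similarity(x) > self.threshold:
            self.skipped += 1
            return self.cache
        self.cache = self.module(x)
        self.computed += 1
        return self.cache


# Toy stand-in for an expensive transformer sub-module.
def double(x: Vector) -> Vector:
    return [2.0 * v for v in x]


block = LazyBlock(double, weights=[1.0, 0.0], bias=0.0)

# Three consecutive "denoising steps" with different inputs.
steps = [[-1.0, 0.0], [2.0, 0.0], [-3.0, 1.0]]
outputs = [block(x) for x in steps]
```

In this toy run the first step always computes because the cache is empty, the second step reuses the cached output because the gate score exceeds the threshold, and the third step recomputes. In a real model the gate would be learned jointly with a penalty that encourages skipping, as the abstract's phrase "trained to maximize laziness" suggests; the training objective is not shown here.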

