CV | Apr 17

Efficient Video Diffusion Models: Advancements and Challenges

arXiv:2604.15911 | 79.21 citations | h-index: 13
AI Analysis

For researchers and engineers working on video generation, this survey provides a structured overview of efficiency techniques, but it is a literature review without novel contributions.

This survey systematically reviews efficient video diffusion models, categorizing methods into step distillation, efficient attention, model compression, and cache/trajectory optimization, and analyzes their impact on reducing inference costs. It identifies open challenges such as quality preservation under composite acceleration and hardware-software co-design.

Video diffusion models have rapidly become the dominant paradigm for high-fidelity generative video synthesis, but their practical deployment remains constrained by severe inference costs. Compared with image generation, video synthesis compounds computation across spatio-temporal token growth and iterative denoising, making attention and memory traffic major bottlenecks in real-world settings. This survey provides a systematic, deployment-oriented review of efficient video diffusion models. We propose a unified categorization that organizes existing methods into four main paradigms: step distillation, efficient attention, model compression, and cache/trajectory optimization. Building on this categorization, we analyze the algorithmic trends within each paradigm and examine how different design choices target two core objectives: reducing the number of function evaluations and minimizing per-step overhead. Finally, we discuss open challenges and future directions, including quality preservation under composite acceleration, hardware-software co-design, robust real-time long-horizon generation, and open infrastructure for standardized evaluation. To the best of our knowledge, this is the first comprehensive survey on efficient video diffusion models, offering researchers and engineers a structured overview of the field and its emerging research directions.
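The two objectives named in the abstract (fewer function evaluations, lower per-step overhead) can be illustrated with a rough back-of-the-envelope cost model. The sketch below is not taken from the survey: the function names, the latent sizes, the channel width, and the "4 * tokens^2 * dim" FLOP estimate for full self-attention are all illustrative assumptions, and real models add projection, MLP, and memory-traffic costs that are ignored here.

```python
def attention_flops(frames: int, height: int, width: int, dim: int):
    """Return (token_count, approximate FLOPs) for one full self-attention layer.

    Assumption: one token per latent position, so tokens = frames * height * width.
    Assumption: QK^T and the attention-weighted sum over V each cost about
    2 * tokens^2 * dim FLOPs, giving 4 * tokens^2 * dim in total.
    """
    tokens = frames * height * width
    return tokens, 4 * tokens**2 * dim


def total_cost(nfe: int, per_step_flops: int) -> int:
    """Total sampling cost modeled as (number of function evaluations) * (cost per step)."""
    return nfe * per_step_flops
```

Under these assumptions, a hypothetical 32x32 latent image has 1,024 tokens, while a 16-frame video at the same resolution has 16,384 tokens, so full attention costs 256 times more per layer. Cutting a hypothetical 50-step sampler to 4 steps reduces `total_cost` by 12.5x at a fixed per-step cost, while cheaper attention or caching would lower the per-step term instead.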
