CV | Feb 11, 2025

Enhance-A-Video: Better Generated Video for Free

Yang Luo, Xuanlei Zhao, Mengzhao Chen, Kaipeng Zhang, Wenqi Shao, Kai Wang, Zhangyang Wang, Yang You

arXiv:2502.07508v3 | 17.410 citations | h-index: 20 | Has Code

Originality: Incremental advance

AI Analysis

This work targets AI researchers and practitioners who want better output from existing video generators. It is incremental because it builds on existing DiT-based frameworks rather than proposing a new model, and it requires no retraining.

The paper aims to improve the coherence and quality of videos generated by DiT-based models. It introduces Enhance-A-Video, a training-free approach that strengthens cross-frame correlations using non-diagonal temporal attention distributions. The authors report promising improvements in temporal consistency and visual quality across various models.

DiT-based video generation has achieved remarkable results, but research into enhancing existing models remains relatively unexplored. In this work, we introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos, named Enhance-A-Video. The core idea is enhancing the cross-frame correlations based on non-diagonal temporal attention distributions. Thanks to its simple design, our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning. Across various DiT-based video generation models, our approach demonstrates promising improvements in both temporal consistency and visual quality. We hope this research can inspire future explorations in video generation enhancement.
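To make the core idea more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract only says that cross-frame correlations are enhanced based on the non-diagonal entries of the temporal attention distribution. The function names, the `temperature` parameter, and the `1 + temperature * intensity` scaling rule are all hypothetical choices made for this example. The sketch takes one spatial token's query, key, and value vectors across frames, builds the frame-to-frame attention map, averages its off-diagonal entries as a measure of how much frames attend to each other, and uses that average to scale the attention output.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention_map(queries, keys):
    """Frame-to-frame attention for one spatial token.

    queries, keys: one vector (list of floats) per frame.
    Returns a frames x frames matrix whose rows sum to 1.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]

def cross_frame_intensity(attn):
    """Mean of the non-diagonal entries (needs at least 2 frames)."""
    n = len(attn)
    off_diagonal = [attn[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(off_diagonal) / len(off_diagonal)

def enhance_temporal_output(queries, keys, values, temperature=1.0):
    """Scale the temporal attention output by a cross-frame gain.

    ASSUMPTION: the gain rule below is a placeholder for illustration;
    the abstract does not state the exact formula.
    """
    attn = temporal_attention_map(queries, keys)
    gain = 1.0 + temperature * cross_frame_intensity(attn)
    dim = len(values[0])
    output = [
        [gain * sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in attn
    ]
    return output, gain
```

Because the sketch only reads the attention map and rescales an existing output, it involves no learned parameters, which is consistent with the training-free claim. In a real DiT the same computation would run per attention head and per spatial location on tensors rather than lists.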

