CV · Jan 2, 2025

SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration

arXiv:2501.01320v4 · 52 citations · h-index: 24 · CVPR
AI Analysis

This work addresses video restoration for real-world applications, offering an incremental improvement over existing methods.

The paper tackles video restoration at arbitrary length and resolution by proposing SeedVR, a diffusion transformer that achieves highly competitive performance on synthetic, real-world, and AI-generated video benchmarks.

Video restoration poses non-trivial challenges in maintaining fidelity while recovering temporally consistent details from unknown degradations in the wild. Despite recent advances in diffusion-based restoration, these methods often face limitations in generation capability and sampling efficiency. In this work, we present SeedVR, a diffusion transformer designed to handle real-world video restoration with arbitrary length and resolution. The core design of SeedVR lies in the shifted window attention that facilitates effective restoration on long video sequences. SeedVR further supports variable-sized windows near the boundary of both spatial and temporal dimensions, overcoming the resolution constraints of traditional window attention. Equipped with contemporary practices, including causal video autoencoder, mixed image and video training, and progressive training, SeedVR achieves highly-competitive performance on both synthetic and real-world benchmarks, as well as AI-generated videos. Extensive experiments demonstrate SeedVR's superiority over existing methods for generic video restoration.
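The abstract describes shifted window attention with variable-sized windows near the spatial and temporal boundaries, but gives no implementation details. The following is a minimal sketch of how such a partition could be computed, not the authors' code. The function names `axis_windows` and `video_windows`, the half-window shift, and the example sizes are all assumptions made for illustration. Unlike a fixed-size scheme that needs padding or a cyclic roll, this sketch simply lets the windows at the edges be smaller, so any (frames, height, width) shape is covered exactly once.

```python
from itertools import product


def axis_windows(length, window, shift=0):
    """Split [0, length) into consecutive (start, end) spans of at most `window`.

    A non-zero `shift` offsets the grid: the first span is shortened to
    `shift` elements, and the last span absorbs whatever remains.
    Illustrative assumption only; the paper's abstract does not specify this.
    """
    if length <= 0 or window <= 0:
        raise ValueError("length and window must be positive")
    shift %= window
    bounds = [0]
    if 0 < shift < length:
        bounds.append(shift)  # shortened leading window
    start = bounds[-1]
    while start + window < length:
        start += window
        bounds.append(start)
    bounds.append(length)  # trailing window may be smaller than `window`
    return list(zip(bounds[:-1], bounds[1:]))


def video_windows(shape, window, shifted=False):
    """Enumerate 3D windows over a (T, H, W) grid as tuples of per-axis spans.

    When `shifted` is True, each axis is offset by half its window size
    (a hypothetical choice mirroring common shifted-window designs).
    """
    shifts = tuple(w // 2 if shifted else 0 for w in window)
    per_axis = [axis_windows(n, w, s) for n, w, s in zip(shape, window, shifts)]
    return list(product(*per_axis))


def volume(win):
    """Number of tokens inside one 3D window."""
    size = 1
    for start, end in win:
        size *= end - start
    return size


if __name__ == "__main__":
    shape, window = (5, 6, 6), (2, 4, 4)  # hypothetical latent grid and window
    for shifted in (False, True):
        wins = video_windows(shape, window, shifted)
        print(shifted, len(wins), sum(volume(w) for w in wins))
```

Attention would then be computed independently within each window, with regular and shifted partitions alternated across layers so that information can cross window borders. Because the spans are derived from the input shape at run time, the same routine works for any video length or resolution.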

