CV | AI | LG | Jun 19, 2023

GD-VDM: Generated Depth for better Diffusion-based Video Generation

arXiv:2306.11173v1 | 14 citations | h-index: 27 | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses coherent video generation for AI and computer vision applications, but the contribution appears incremental because it builds on existing diffusion and Vid2Vid methods.

The paper tackles video generation of complex scenes by proposing GD-VDM, a diffusion model that uses generated depth videos and a Vid2Vid model, resulting in more diverse and complex scenes on the Cityscapes dataset compared to baselines.

The field of generative models has recently witnessed significant progress, with diffusion models showing remarkable performance in image generation. In light of this success, there is a growing interest in exploring the application of diffusion models to other modalities. One such challenge is the generation of coherent videos of complex scenes, which poses several technical difficulties, such as capturing temporal dependencies and generating long, high-resolution videos. This paper proposes GD-VDM, a novel diffusion model for video generation, demonstrating promising results. GD-VDM is based on a two-phase generation process involving generating depth videos followed by a novel diffusion Vid2Vid model that generates a coherent real-world video. We evaluated GD-VDM on the Cityscapes dataset and found that it generates more diverse and complex scenes compared to natural baselines, demonstrating the efficacy of our approach.
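The two-phase data flow described in the abstract can be illustrated with a toy sketch. Only the ordering (a depth video is sampled first, then a real-world video is sampled conditioned on it) comes from the abstract. Everything else is an assumption made for illustration: the function names, the list-of-frames video layout, the step count, and above all `toy_denoise_step`, which is a hand-written stand-in for a learned denoising network and is not the authors' model.

```python
import random


def make_noise(frames, pixels, rng):
    """Gaussian noise 'video': a list of frames, each a flat list of pixel values."""
    return [[rng.gauss(0.0, 1.0) for _ in range(pixels)] for _ in range(frames)]


def toy_denoise_step(video, cond, keep=0.9):
    """Hypothetical stand-in for one learned denoising step.

    Keeps a fraction `keep` of the current sample and moves the rest toward
    the conditioning video (or toward zero when there is no conditioning).
    """
    out = []
    for f, frame in enumerate(video):
        target = cond[f] if cond is not None else [0.0] * len(frame)
        out.append([keep * v + (1.0 - keep) * c for v, c in zip(frame, target)])
    return out


def sample(frames, pixels, steps, rng, cond=None):
    """Start from noise and apply the toy denoising step `steps` times."""
    x = make_noise(frames, pixels, rng)
    for _ in range(steps):
        x = toy_denoise_step(x, cond)
    return x


def generate_two_phase(frames=4, pixels=8, steps=10, seed=0):
    """Phase 1 samples a depth video; phase 2 samples a video conditioned on it."""
    rng = random.Random(seed)
    depth = sample(frames, pixels, steps, rng)            # phase 1: depth video
    rgb = sample(frames, pixels, steps, rng, cond=depth)  # phase 2: depth-conditioned Vid2Vid
    return depth, rgb


if __name__ == "__main__":
    depth, rgb = generate_two_phase()
    print(len(depth), "depth frames;", len(rgb), "output frames")
```

In this sketch the second phase sees the first phase's output only through the `cond` argument, which is the point the abstract makes: scene structure is fixed in the depth domain before appearance is generated.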

Code Implementations: 1 repo