CV · MM · Aug 7, 2023

DiffSynth: Latent In-Iteration Deflickering for Realistic Video Synthesis

arXiv:2308.03463v3 · 13 citations · h-index: 24
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of generating coherent videos for applications such as stylization and rendering. It is an incremental improvement over existing zero-shot methods.

The paper tackles flickering in diffusion-based video synthesis by proposing DiffSynth, which combines a latent in-iteration deflickering framework with a patch blending algorithm. In tasks such as text-guided video stylization, it produces high-quality videos without cherry-picking.

In recent years, diffusion models have emerged as the most powerful approach to image synthesis. However, applying these models directly to video synthesis presents challenges, as it often leads to noticeable flickering content. Although recently proposed zero-shot methods can alleviate flicker to some extent, we still struggle to generate coherent videos. In this paper, we propose DiffSynth, a novel approach that aims to convert image synthesis pipelines to video synthesis pipelines. DiffSynth consists of two key components: a latent in-iteration deflickering framework and a video deflickering algorithm. The latent in-iteration deflickering framework applies video deflickering to the latent space of diffusion models, effectively preventing flicker accumulation in intermediate steps. Additionally, we propose a video deflickering algorithm, named the patch blending algorithm, that remaps objects in different frames and blends them together to enhance video consistency. One of the notable advantages of DiffSynth is its general applicability to various video synthesis tasks, including text-guided video stylization, fashion video synthesis, image-guided video stylization, video restoration, and 3D rendering. In the task of text-guided video stylization, we make it possible to synthesize high-quality videos without cherry-picking. The experimental results demonstrate the effectiveness of DiffSynth. All videos can be viewed on our project page. Source code will also be released.
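
The idea of deflickering inside the iteration loop, rather than once on the finished video, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `toy_denoise_step` is a hypothetical stand-in for a per-frame diffusion step, and `temporal_smooth` is a plain neighbour average that substitutes for the paper's patch blending algorithm, which the abstract does not specify in detail. The sketch only shows where the deflickering call sits in the loop and why per-step smoothing may keep frame-to-frame differences from building up.

```python
import random


def temporal_smooth(latents, window=1):
    """Stand-in deflicker: average each frame's latent with its neighbours.

    This is NOT the paper's patch blending algorithm, only a simple
    temporal mean used to mark where deflickering would happen.
    `latents` is a list of frames; each frame is a flat list of floats.
    """
    num_frames = len(latents)
    smoothed = []
    for t in range(num_frames):
        neighbours = latents[max(0, t - window):min(num_frames, t + window + 1)]
        smoothed.append([sum(vals) / len(neighbours) for vals in zip(*neighbours)])
    return smoothed


def toy_denoise_step(latents, rng, step_size=0.2, noise_scale=0.05):
    """Hypothetical per-frame denoiser.

    Pulls every value toward zero and adds independent noise to each frame,
    which mimics the frame-to-frame inconsistency of an image model.
    """
    return [
        [x - step_size * x + rng.gauss(0.0, noise_scale) for x in frame]
        for frame in latents
    ]


def synthesize(num_frames=8, latent_size=64, steps=20, deflicker=True, seed=0):
    """Run the toy loop, optionally deflickering the latents after each step."""
    rng = random.Random(seed)
    latents = [
        [rng.gauss(0.0, 1.0) for _ in range(latent_size)]
        for _ in range(num_frames)
    ]
    for _ in range(steps):
        latents = toy_denoise_step(latents, rng)
        if deflicker:
            # In-iteration deflickering: applied at every intermediate step.
            latents = temporal_smooth(latents)
    return latents


def flicker(latents):
    """Mean absolute difference between consecutive frames."""
    diffs = [
        abs(a - b)
        for prev, cur in zip(latents, latents[1:])
        for a, b in zip(prev, cur)
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    print("flicker without deflickering:", flicker(synthesize(deflicker=False)))
    print("flicker with deflickering:   ", flicker(synthesize(deflicker=True)))
```

In this toy setting the run with per-step smoothing should report a lower flicker score than the run without it. A real pipeline would replace both stand-ins with an actual diffusion scheduler step and a motion-aware blending method.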

