CV · GR · Nov 27, 2023

MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

arXiv:2311.16498v1 · 386 citations · h-index: 27
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of preserving identity and temporal smoothness in human image animation, with applications such as video generation. Its contribution is an incremental improvement over existing diffusion-based methods.

The paper tackles the problem of generating temporally consistent videos of a reference person following a motion sequence, achieving over 38% improvement in video fidelity on a TikTok dancing dataset compared to the strongest baseline.

This paper studies the human image animation task, which aims to generate a video of a certain reference identity following a particular motion sequence. Existing animation works typically employ the frame-warping technique to animate the reference image towards the target motion. Despite achieving reasonable results, these approaches face challenges in maintaining temporal consistency throughout the animation due to the lack of temporal modeling and poor preservation of reference identity. In this work, we introduce MagicAnimate, a diffusion-based framework that aims at enhancing temporal consistency, preserving reference image faithfully, and improving animation fidelity. To achieve this, we first develop a video diffusion model to encode temporal information. Second, to maintain the appearance coherence across frames, we introduce a novel appearance encoder to retain the intricate details of the reference image. Leveraging these two innovations, we further employ a simple video fusion technique to encourage smooth transitions for long video animation. Empirical results demonstrate the superiority of our method over baseline approaches on two benchmarks. Notably, our approach outperforms the strongest baseline by over 38% in terms of video fidelity on the challenging TikTok dancing dataset. Code and model will be made available.
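The abstract mentions a "simple video fusion technique" for long videos but does not spell it out here. One plausible reading, assumed for this sketch and not confirmed by the text above, is that a long sequence is split into overlapping windows and the per-frame predictions are averaged wherever windows overlap. The function names, the scalar per-frame values (standing in for per-frame latents), and the window and stride parameters are all hypothetical.

```python
def sliding_window_starts(total_frames, window, stride):
    """Start indices of fixed-length windows that together cover every frame.

    Assumption: windows advance by `stride`, and a final window is added
    flush with the end if the regular grid leaves trailing frames uncovered.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if window > total_frames:
        raise ValueError("window must not exceed total_frames")
    starts = list(range(0, total_frames - window + 1, stride))
    if starts[-1] + window < total_frames:
        starts.append(total_frames - window)
    return starts


def fuse_overlapping_segments(segments, starts, total_frames):
    """Average per-frame values wherever segments overlap.

    segments: list of lists, one per window, holding one value per frame
    starts:   index of the first frame covered by each segment
    """
    sums = [0.0] * total_frames
    counts = [0] * total_frames
    for segment, start in zip(segments, starts):
        for offset, value in enumerate(segment):
            sums[start + offset] += value
            counts[start + offset] += 1
    if any(count == 0 for count in counts):
        raise ValueError("every frame must be covered by at least one segment")
    # Frames covered by several windows get the mean of their predictions.
    return [total / count for total, count in zip(sums, counts)]


if __name__ == "__main__":
    starts = sliding_window_starts(total_frames=5, window=3, stride=2)
    # Two hypothetical windows whose predictions disagree on the shared frame.
    segments = [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]]
    print(starts)
    print(fuse_overlapping_segments(segments, starts, total_frames=5))
```

In this toy run the two windows share frame 2, so its fused value is the mean of the two predictions, which is what smooths the transition between adjacent windows. In a diffusion setting the same averaging would presumably be applied to latent tensors rather than scalars.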

Code Implementations: 3 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
