CV · GR · May 29, 2025

How Animals Dance (When You're Not Looking)

UW
arXiv:2505.23738v1 · h-index: 33
Originality: Synthesis-oriented
AI Analysis

This addresses the niche problem of creating entertaining animal dance content for media or social media applications, but it is incremental as it builds on existing text-to-image and video diffusion methods.

The paper tackles the problem of generating music-synchronized animal dance videos by developing a keyframe-based framework that formulates dance synthesis as a graph optimization problem, producing up to 30-second videos from as few as six input keyframes across various animals and music tracks.

We present a keyframe-based framework for generating music-synchronized, choreography-aware animal dance videos. Starting from a few keyframes representing distinct animal poses -- generated via text-to-image prompting or GPT-4o -- we formulate dance synthesis as a graph optimization problem: find the optimal keyframe structure that satisfies a specified choreography pattern of beats, which can be automatically estimated from a reference dance video. We also introduce an approach for mirrored pose image generation, essential for capturing symmetry in dance. In-between frames are synthesized using a video diffusion model. With as few as six input keyframes, our method can produce dance videos of up to 30 seconds across a wide range of animals and music tracks.
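To make the graph-optimization framing concrete, here is a minimal Python sketch. It is not the paper's formulation, only an illustration under stated assumptions: the choreography pattern is encoded as a string with one label per beat (for example "ABAB"), each label must map to a distinct keyframe, and a hypothetical cost matrix scores how hard it is to transition between two keyframes. The function name `assign_keyframes`, the string encoding, and the cost matrix are all invented for this example, and the search is brute force, which is only practical for a handful of keyframes.

```python
from itertools import permutations


def assign_keyframes(pattern, cost):
    """Pick a distinct keyframe for each label in a beat pattern.

    pattern: one label per beat, e.g. "ABAB" (hypothetical encoding).
    cost[i][j]: assumed cost of moving from keyframe i to keyframe j.
    Returns (keyframe index per beat, total transition cost).
    """
    labels = sorted(set(pattern))
    n = len(cost)
    if len(labels) > n:
        raise ValueError("pattern needs more distinct keyframes than available")

    best_seq, best_cost = None, float("inf")
    # Try every way of mapping distinct labels to distinct keyframes.
    for combo in permutations(range(n), len(labels)):
        mapping = dict(zip(labels, combo))
        seq = [mapping[label] for label in pattern]
        # Sum the cost of each consecutive beat-to-beat transition.
        total = sum(cost[a][b] for a, b in zip(seq, seq[1:]))
        if total < best_cost:
            best_seq, best_cost = seq, total
    return best_seq, best_cost


# Three keyframes with made-up symmetric transition costs.
cost = [
    [0, 5, 1],
    [5, 0, 4],
    [1, 4, 0],
]
seq, total = assign_keyframes("ABAB", cost)
print(seq, total)
```

In this toy setup the repeated labels force the same keyframe to recur on matching beats, which is the structural constraint a choreography pattern imposes; a real system would also need to place the chosen keyframes on the music's beat times and synthesize the in-between frames.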
