CV · Jul 16, 2024

Length-Aware Motion Synthesis via Latent Diffusion

arXiv:2407.11532v1 · 14 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This paper addresses a critical limitation in human motion synthesis for applications that require variable-length animations, though the advance appears incremental because it builds on existing diffusion and VAE methods.

The paper tackles the problem of generating 3D human motion sequences from text with control over the target duration. It introduces a model called Length-Aware Latent Diffusion (LADiff), which the authors report significantly improves over the state of the art on most existing motion synthesis metrics on the HumanML3D and KIT-ML benchmarks.

The target duration of a synthesized human motion is a critical attribute that requires modeling control over the motion dynamics and style. Speeding up an action performance is not merely fast-forwarding it. However, state-of-the-art techniques for human behavior synthesis have limited control over the target sequence length. We introduce the problem of generating length-aware 3D human motion sequences from textual descriptors, and we propose a novel model to synthesize motions of variable target lengths, which we dub "Length-Aware Latent Diffusion" (LADiff). LADiff consists of two new modules: 1) a length-aware variational auto-encoder to learn motion representations with length-dependent latent codes; 2) a length-conforming latent diffusion model to generate motions with a richness of details that increases with the required target sequence length. LADiff significantly improves over the state-of-the-art across most of the existing motion synthesis metrics on the two established benchmarks of HumanML3D and KIT-ML.
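To make the idea of "length-dependent latent codes" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract only states that the latent codes depend on length and that detail grows with the target length. The proportional rule below, the function names, and the defaults (`max_len=196` frames, `max_latents=10` tokens) are all assumptions chosen for illustration.

```python
def active_latent_count(target_len, max_len=196, max_latents=10):
    """Hypothetical rule: longer target motions activate more latent tokens.

    max_len and max_latents are illustrative defaults, not values taken
    from the paper.
    """
    if target_len < 1:
        raise ValueError("target_len must be a positive number of frames")
    clipped = min(target_len, max_len)
    # Integer ceiling of clipped / max_len * max_latents, at least 1.
    return max(1, (clipped * max_latents + max_len - 1) // max_len)


def length_mask(target_len, max_len=196, max_latents=10):
    """Binary mask with 1 for active latent tokens and 0 for inactive ones."""
    k = active_latent_count(target_len, max_len, max_latents)
    return [1] * k + [0] * (max_latents - k)


def apply_mask(latents, mask):
    """Zero out the latent tokens (rows) that the mask marks as inactive."""
    return [[value * m for value in row] for row, m in zip(latents, mask)]


if __name__ == "__main__":
    # Ten latent tokens of dimension 2, all filled with the same toy values.
    latents = [[1.0, 2.0] for _ in range(10)]
    for frames in (40, 98, 196):
        mask = length_mask(frames)
        print(frames, mask, apply_mask(latents, mask)[-1])
```

Under these assumptions a short motion uses only a few latent tokens and a long one uses all of them, which mirrors the abstract's claim that richness of detail increases with the required sequence length. How LADiff actually allocates its latent space may differ.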

Code Implementations: 1 repo
