CV, Jul 16, 2024

Length-Aware Motion Synthesis via Latent Diffusion

Alessio Sampieri, Alessio Palma, Indro Spinelli, Fabio Galasso

arXiv:2407.11532v111.314 citations, h-index: 25, Has Code

Originality: Incremental advance

AI Analysis

This work addresses a limitation in human motion synthesis that matters for applications requiring variable-length animations, though it appears incremental because it builds on existing diffusion and VAE methods.

The paper tackles the problem of generating 3D human motion sequences from text with control over the target duration. It introduces a model called Length-Aware Latent Diffusion (LADiff), which the authors report improves over the state of the art on most existing motion synthesis metrics on the HumanML3D and KIT-ML benchmarks.

The target duration of a synthesized human motion is a critical attribute that requires modeling control over the motion dynamics and style. Speeding up an action performance is not merely fast-forwarding it. However, state-of-the-art techniques for human behavior synthesis have limited control over the target sequence length. We introduce the problem of generating length-aware 3D human motion sequences from textual descriptors, and we propose a novel model to synthesize motions of variable target lengths, which we dub "Length-Aware Latent Diffusion" (LADiff). LADiff consists of two new modules: 1) a length-aware variational auto-encoder to learn motion representations with length-dependent latent codes; 2) a length-conforming latent diffusion model to generate motions with a richness of details that increases with the required target sequence length. LADiff significantly improves over the state-of-the-art across most of the existing motion synthesis metrics on the two established benchmarks of HumanML3D and KIT-ML.
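
To make the idea of "length-dependent latent codes" concrete, here is a minimal sketch in which the number of active latent tokens grows with the requested sequence length. It is not the authors' implementation: the function names, the linear rule, and the default values (`min_len`, `max_len`, `max_latents`) are all assumptions chosen for illustration.

```python
import math


def active_latents(target_len, min_len=20, max_len=200, max_latents=10):
    """Return how many latent tokens to use for a target length in frames.

    Hypothetical rule (not taken from the paper): scale linearly from
    1 token at min_len up to max_latents tokens at max_len.
    """
    if not min_len <= target_len <= max_len:
        raise ValueError("target_len must lie in [min_len, max_len]")
    frac = (target_len - min_len) / (max_len - min_len)
    return 1 + math.floor(frac * (max_latents - 1))


def latent_mask(target_len, min_len=20, max_len=200, max_latents=10):
    """Build a 0/1 mask over latent slots; 1 marks a slot that is in use."""
    k = active_latents(target_len, min_len, max_len, max_latents)
    return [1] * k + [0] * (max_latents - k)


if __name__ == "__main__":
    # Longer targets activate more latent slots under this assumed rule.
    for n in (20, 110, 200):
        print(n, active_latents(n), latent_mask(n))
```

Under this assumed rule, a longer target activates more latent slots, which loosely mirrors the abstract's claim that the richness of detail increases with the required sequence length. A diffusion model could then denoise only the active slots, but that step is left out here.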
