CV | AI | Mar 6, 2025

How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects

arXiv:2503.04257v2 | 4 citations | h-index: 6 | ICML
Originality: Incremental advance
AI Analysis

This work addresses 3D content creation for a wide range of objects with varying skeletal structures. The advance is incremental, since it builds on existing motion diffusion models.

The paper tackles motion synthesis across object categories by addressing limitations in both datasets and methods; the resulting model generates high-fidelity motions from text for diverse and even unseen objects.

Motion synthesis for diverse object categories holds great potential for 3D content creation but remains underexplored due to two key challenges: (1) the lack of comprehensive motion datasets that include a wide range of high-quality motions and annotations, and (2) the absence of methods capable of handling heterogeneous skeletal templates from diverse objects. To address these challenges, we contribute the following: First, we augment the Truebones Zoo dataset, a high-quality animal motion dataset covering over 70 species, by annotating it with detailed text descriptions, making it suitable for text-based motion synthesis. Second, we introduce rig augmentation techniques that generate diverse motion data while preserving consistent dynamics, enabling models to adapt to various skeletal configurations. Finally, we redesign existing motion diffusion models to dynamically adapt to arbitrary skeletal templates, enabling motion synthesis for a diverse range of objects with varying structures. Experiments show that our method learns to generate high-fidelity motions from textual descriptions for diverse and even unseen objects, setting a strong foundation for motion synthesis across diverse object categories and skeletal templates. Qualitative results are available at: https://t2m4lvo.github.io.

