CV · AI · CL · RO · Dec 19, 2023

MotionScript: Natural Language Descriptions for Expressive 3D Human Motions

arXiv:2312.12634v5 · 25 citations · h-index: 2 · IROS
Originality: Incremental advance
AI Analysis

MotionScript addresses the need for better motion synthesis in animation, virtual simulation, and robotics by creating an interpretable bridge between language and motion, though it appears incremental because it builds on existing text-to-motion models.

The paper tackles the problem of generating detailed natural language descriptions for 3D human motions. It introduces MotionScript, a framework that provides fine-grained, structured captions for training text-to-motion models, and reports significant improvements in out-of-distribution motion generation.

We introduce MotionScript, a novel framework for generating highly detailed, natural language descriptions of 3D human motions. Unlike existing motion datasets that rely on broad action labels or generic captions, MotionScript provides fine-grained, structured descriptions that capture the full complexity of human movement including expressive actions (e.g., emotions, stylistic walking) and interactions beyond standard motion capture datasets. MotionScript serves as both a descriptive tool and a training resource for text-to-motion models, enabling the synthesis of highly realistic and diverse human motions from text. By augmenting motion datasets with MotionScript captions, we demonstrate significant improvements in out-of-distribution motion generation, allowing large language models (LLMs) to generate motions that extend beyond existing data. Additionally, MotionScript opens new applications in animation, virtual human simulation, and robotics, providing an interpretable bridge between intuitive descriptions and motion synthesis. To the best of our knowledge, this is the first attempt to systematically translate 3D motion into structured natural language without requiring training data.
