CV · Mar 20, 2024

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

arXiv:2403.13900v2 · 72 citations · h-index: 67 · ECCV
Originality: Incremental advance
AI Analysis

CoMo addresses the difficulty of modifying subtle postures or inserting new actions at specific moments in generated motion, which matters for applications such as animation or robotics. It is an incremental improvement over existing methods.

The paper tackles fine-grained controllability in text-to-motion generation by introducing CoMo, which represents motions as pose codes and uses LLMs to edit them. It reports generation performance competitive with state-of-the-art models and, in human studies, better editing ability than previous work.

Text-to-motion models excel at efficient human motion generation, but existing approaches lack fine-grained controllability over the generation process. Consequently, modifying subtle postures within a motion or inserting new actions at specific moments remains a challenge, limiting the applicability of these methods in diverse scenarios. In light of these challenges, we introduce CoMo, a Controllable Motion generation model, adept at accurately generating and editing motions by leveraging the knowledge priors of large language models (LLMs). Specifically, CoMo decomposes motions into discrete and semantically meaningful pose codes, with each code encapsulating the semantics of a body part, representing elementary information such as "left knee slightly bent". Given textual inputs, CoMo autoregressively generates sequences of pose codes, which are then decoded into 3D motions. Leveraging pose codes as interpretable representations, an LLM can directly intervene in motion editing by adjusting the pose codes according to editing instructions. Experiments demonstrate that CoMo achieves competitive performance in motion generation compared to state-of-the-art models while, in human studies, CoMo substantially surpasses previous work in motion editing abilities.
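The editing idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the codebook entries, the `Edit` record, and the `apply_edits` and `describe` helpers are hypothetical names, not the paper's actual representation or API. The sketch treats a motion as a list of frames, each mapping a body part to one discrete state, and shows how a structured edit (the kind of output an LLM might produce from an instruction such as "bend the left knee slightly in the middle of the motion") could be applied to a frame range.

```python
from dataclasses import dataclass

# Hypothetical codebook: each body part takes one of a few discrete states.
# These entries are illustrative and are not taken from the paper.
CODEBOOK = {
    "left_knee": ["straight", "slightly bent", "bent"],
    "left_arm": ["down", "raised", "overhead"],
}


@dataclass(frozen=True)
class Edit:
    """One structured edit: set `part` to `state` on frames [start, end)."""
    part: str
    state: str
    start: int
    end: int


def apply_edits(motion, edits):
    """Return an edited copy of `motion`; the input list is left unchanged."""
    edited = [dict(frame) for frame in motion]
    for e in edits:
        if e.part not in CODEBOOK or e.state not in CODEBOOK[e.part]:
            raise ValueError(f"unknown pose code: {e.part}={e.state}")
        # Clamp the range so out-of-bounds edits do not raise.
        for t in range(max(e.start, 0), min(e.end, len(edited))):
            edited[t][e.part] = e.state
    return edited


def describe(frame):
    """Render one frame as readable text, e.g. 'left knee slightly bent'."""
    return ", ".join(
        f"{part.replace('_', ' ')} {state}" for part, state in sorted(frame.items())
    )


# A four-frame motion in a neutral pose.
motion = [{"left_knee": "straight", "left_arm": "down"} for _ in range(4)]

# Stand-in for an LLM's structured answer to an editing instruction.
edits = [Edit(part="left_knee", state="slightly bent", start=1, end=3)]

edited = apply_edits(motion, edits)
print(describe(edited[1]))
```

In this toy setting the edit is interpretable because each code has a readable meaning. In the model described above, the edited pose codes would then be decoded into 3D motion, a step this sketch does not attempt.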
