CV, GR | Sep 1, 2022

FLAME: Free-form Language-based Motion Synthesis & Editing

arXiv:2209.00349v2 | 283 citations | h-index: 20
Originality: Incremental advance
AI Analysis

This work addresses the need for automated motion synthesis in industries like gaming, animation, and robotics, representing an incremental improvement over existing methods.

The authors tackled the problem of generating and editing human motions from free-form text descriptions, achieving state-of-the-art performance on three text-motion datasets: HumanML3D, BABEL, and KIT.

Text-based motion generation models are drawing a surge of interest for their potential to automate the motion-making process in the game, animation, and robotics industries. In this paper, we propose a diffusion-based motion synthesis and editing model named FLAME. Inspired by the recent successes of diffusion models, we integrate diffusion-based generative models into the motion domain. FLAME can generate high-fidelity motions that are well aligned with the given text. It can also edit parts of a motion, both frame-wise and joint-wise, without any fine-tuning. FLAME involves a new transformer-based architecture that we devise to better handle motion data, which we find to be crucial for managing variable-length motions and attending well to free-form text. In experiments, we show that FLAME achieves state-of-the-art generation performance on three text-motion datasets: HumanML3D, BABEL, and KIT. We also demonstrate that the editing capability of FLAME can be extended to other tasks such as motion prediction and motion in-betweening, which have previously been covered by dedicated models.

Code Implementations: 1 repo
