CV · AI · May 2, 2025

TSTMotion: Training-free Scene-aware Text-to-motion Generation

arXiv:2505.01182v2 · 5 citations · h-index: 27 · Has Code · ICME
Originality: Incremental advance
AI Analysis

This paper addresses scene-aware text-to-motion generation for applications in animation and virtual reality. It offers a more practical solution by avoiding costly data collection, though the advance is incremental because it builds on existing pre-trained models.

The paper tackles the problem of generating human motions in 3D scenes from text descriptions without requiring expensive training data. It proposes a training-free framework that adapts pre-trained motion generators for scene-aware motion generation, and reports experiments supporting its efficacy and generalizability.

Text-to-motion generation has recently garnered significant research interest, primarily focusing on generating human motion sequences in blank backgrounds. However, human motions commonly occur within diverse 3D scenes, which has prompted exploration into scene-aware text-to-motion generation methods. Yet existing scene-aware methods often rely on large-scale ground-truth motion sequences in diverse 3D scenes, which are expensive to collect and therefore pose practical challenges. To mitigate this challenge, we are the first to propose a Training-free Scene-aware Text-to-Motion framework, dubbed TSTMotion, that efficiently empowers pre-trained blank-background motion generators with scene-aware capability. Specifically, conditioned on the given 3D scene and text description, we use foundation models jointly to reason about, predict, and validate scene-aware motion guidance. The motion guidance is then incorporated into the blank-background motion generators with two modifications, resulting in scene-aware text-driven motion sequences. Extensive experiments demonstrate the efficacy and generalizability of our proposed framework. We release our code on the project page: https://tstmotion.github.io/

