CV · Dec 3, 2024

Diffusion Implicit Policy for Unpaired Scene-aware Motion Synthesis

arXiv:2412.02261v2 · 5 citations · h-index: 18 · Has Code
Originality: Highly original
AI Analysis

The paper addresses the challenge of generalizing scene-aware motion synthesis to diverse scenes, with applications in robotics and animation. It represents a novel approach rather than an incremental improvement.

The paper tackles scene-aware motion synthesis without requiring paired motion-scene data. It proposes the Diffusion Implicit Policy (DIP) framework, which is reported to achieve better motion naturalness and interaction plausibility than state-of-the-art methods on synthesized scenes with ShapeNet furniture and on real scenes from PROX and Replica.

Scene-aware motion synthesis has been widely researched recently because of its many applications. Prevailing methods rely heavily on paired motion-scene data and generalize poorly to diverse scenes when trained on only a few specific ones. We therefore propose a unified framework, termed Diffusion Implicit Policy (DIP), for scene-aware motion synthesis that no longer requires paired motion-scene data. We disentangle human-scene interaction from motion synthesis during training, and then introduce an interaction-based implicit policy into motion diffusion during inference. Synthesized motion is derived through iterative diffusion denoising and implicit policy optimization, so that motion naturalness and interaction plausibility are maintained simultaneously. For long-term motion synthesis, we introduce motion blending in joint rotation power space. The proposed method is evaluated on synthesized scenes with ShapeNet furniture and on real scenes from PROX and Replica. Results show that our framework achieves better motion naturalness and interaction plausibility than cutting-edge methods, which also indicates the feasibility of using DIP for motion synthesis in more general tasks and more varied scenes. Code will be publicly available at https://github.com/jingyugong/DIP.
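
The inference loop described in the abstract, in which denoising alternates with optimization of an interaction objective, can be illustrated with a toy sketch. Nothing below is taken from the DIP code base: `toy_denoise` is a hypothetical stand-in for a trained motion diffusion model, `interaction_grad` replaces the interaction-based implicit policy with plain gradient descent on a goal-reaching cost, and the motion is reduced to a sequence of 3D root positions. The `blend_rotations` helper assumes that "power space" refers to fractional powers of rotations (geodesic interpolation), which is one plausible reading and not confirmed by the abstract.

```python
import numpy as np

def toy_denoise(x):
    """Hypothetical stand-in for a trained motion denoiser: smooths interior frames."""
    smooth = x.copy()
    smooth[1:-1] = (x[:-2] + x[1:-1] + x[2:]) / 3.0
    return x + 0.5 * (smooth - x)

def interaction_grad(x, goal):
    """Gradient of the toy cost 0.5 * ||x[-1] - goal||^2 with respect to x."""
    grad = np.zeros_like(x)
    grad[-1] = x[-1] - goal
    return grad

def synthesize(goal, num_frames=16, num_steps=50, policy_iters=5, lr=0.2, seed=0):
    """Alternate a denoising step with a few cost-descent steps (assumed schedule)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((num_frames, 3))      # start from pure noise
    for t in reversed(range(num_steps)):
        x = toy_denoise(x)                        # pull toward "natural" (smooth) motion
        for _ in range(policy_iters):             # pull toward a plausible interaction
            x = x - lr * interaction_grad(x, goal)
        if t > 0:                                 # re-inject shrinking noise, none at the end
            x = x + 0.1 * (t / num_steps) * rng.standard_normal(x.shape)
    return x

def rotation_power(R, alpha):
    """Fractional power of a 3x3 rotation via axis-angle (rotation angle must be < pi)."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if angle < 1e-8:
        return np.eye(3)                          # identity raised to any power
    axis = np.array([R[2, 1] - R[1, 2],
                     R[0, 2] - R[2, 0],
                     R[1, 0] - R[0, 1]]) / (2.0 * np.sin(angle))
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    a = alpha * angle                             # scale the angle, keep the axis
    return np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)

def blend_rotations(R_a, R_b, alpha):
    """Blend two joint rotations: R_a at alpha=0, R_b at alpha=1."""
    return R_a @ rotation_power(R_a.T @ R_b, alpha)
```

Calling `synthesize(np.array([1.0, 0.0, 2.0]))` returns a `(16, 3)` trajectory whose last frame is driven toward the goal, and `blend_rotations` could be applied per joint across the overlapping frames of two consecutive clips. The step counts, learning rate, and noise schedule are arbitrary choices for illustration.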
