CV | Jan 31, 2025

MotionPCM: Real-Time Motion Synthesis with Phased Consistency Model

arXiv:2501.19083v2 | 16.49 citations | h-index: 3

Originality: Incremental advance

AI Analysis

The paper addresses inefficient real-time motion generation for applications such as animation and robotics, and represents an incremental improvement over existing consistency models.

The paper tackles real-time human motion synthesis by proposing MotionPCM, a phased consistency model that reduces the number of sampling steps. It runs at over 30 FPS with a single sampling step and reports a 38.9% improvement in FID over the previous state of the art on the HumanML3D dataset.

Diffusion models have become a popular choice for human motion synthesis due to their powerful generative capabilities. However, their high computational complexity and large number of sampling steps pose challenges for real-time applications. Fortunately, the Consistency Model (CM) provides a solution that greatly reduces the number of sampling steps from hundreds to a few, typically fewer than four, significantly accelerating the synthesis of diffusion models. However, applying CM to text-conditioned human motion synthesis in latent space yields unsatisfactory generation results. In this paper, we introduce MotionPCM, a phased consistency model-based approach designed to improve both the quality and efficiency of real-time motion synthesis in latent space. Experimental results on the HumanML3D dataset show that our model achieves real-time inference at over 30 frames per second in a single sampling step while outperforming the previous state-of-the-art with a 38.9% improvement in FID. The code will be available for reproduction.
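To make the "hundreds of steps down to a few" claim concrete, here is a minimal sketch of generic few-step consistency sampling (re-noise, then denoise, once per step). It is not MotionPCM's phased variant or the authors' code. The names `toy_consistency_fn` and `consistency_sample`, the noise levels `EPS` and `T_MAX`, and the point-mass "data" `MU` are all illustrative assumptions; a trained network conditioned on text would take the place of the toy function.

```python
import numpy as np

EPS = 0.002    # smallest noise level (assumed value)
T_MAX = 80.0   # largest noise level (assumed value)
MU = np.array([1.0, -2.0, 0.5])  # toy "data": a single point, purely illustrative


def toy_consistency_fn(x, t):
    """Exact consistency function when the data distribution is a point mass at MU.

    In that case the ODE trajectories are straight lines x_t = MU + c * t,
    so mapping a point back to noise level EPS just rescales its offset
    from MU. A trained network would replace this function in practice.
    """
    return MU + (x - MU) * (EPS / t)


def consistency_sample(f, shape, timesteps, rng):
    """Few-step sampling: one call to `f` per entry in `timesteps`.

    `timesteps` is decreasing and starts at the largest noise level;
    a single entry corresponds to one-step generation.
    """
    # Start from pure noise at the largest noise level and denoise once.
    x = f(rng.standard_normal(shape) * timesteps[0], timesteps[0])
    for t in timesteps[1:]:
        # Re-noise the current estimate to level t, then denoise again.
        noise = rng.standard_normal(shape)
        x_t = x + np.sqrt(t**2 - EPS**2) * noise
        x = f(x_t, t)
    return x


rng = np.random.default_rng(0)
one_step = consistency_sample(toy_consistency_fn, MU.shape, [T_MAX], rng)
four_step = consistency_sample(
    toy_consistency_fn, MU.shape, [T_MAX, 20.0, 5.0, 1.0], rng
)
```

In this sketch the cost is one function evaluation per listed noise level, which is why a single-step sampler can reach real-time rates where a sampler with hundreds of steps cannot.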
