Human Motion Synthesis: A Diffusion Approach for Motion Stitching and In-Betweening
This work addresses the problem of generating realistic human motion sequences for applications in animation or robotics, though it appears incremental as it builds on existing diffusion and transformer methods.
The paper tackles motion stitching and in-betweening for human motion generation by proposing a diffusion model with a transformer-based denoiser, and reports strong performance in generating smooth 5-second sequences (75 frames at 15 fps).
Human motion generation is an important area of research in many fields. In this work, we tackle the problem of motion stitching and in-betweening. Current methods either require manual effort or cannot handle longer sequences. To address these challenges, we propose a diffusion model with a transformer-based denoiser to generate realistic human motion. Our method demonstrates strong performance on in-betweening, transforming a variable number of input poses into smooth and realistic motion sequences of 75 frames at 15 fps, for a total duration of 5 seconds. We evaluate our method using quantitative metrics, namely Fréchet Inception Distance (FID), Diversity, and Multimodality, along with visual assessments of the generated outputs.
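To make the in-betweening setup concrete, the following is a minimal sketch of the sequence layout the abstract describes: a 75-frame clip at 15 fps in which a variable number of frames carry input poses and the rest must be generated. The helper names (`duration_seconds`, `keyframe_mask`, `frames_to_generate`) and the binary-mask representation are assumptions made for illustration; the abstract does not say how the input poses are encoded or passed to the denoiser.

```python
FPS = 15          # frame rate stated in the abstract
NUM_FRAMES = 75   # sequence length stated in the abstract


def duration_seconds(num_frames: int = NUM_FRAMES, fps: int = FPS) -> float:
    """Clip duration implied by a frame count and a frame rate."""
    return num_frames / fps


def keyframe_mask(keyframe_indices, num_frames: int = NUM_FRAMES):
    """Return one boolean per frame: True where a pose is given as input.

    Hypothetical helper: a binary mask is only one plausible way to mark
    which frames are known; it is not taken from the paper.
    """
    mask = [False] * num_frames
    for idx in keyframe_indices:
        if not 0 <= idx < num_frames:
            raise ValueError(f"keyframe index {idx} outside [0, {num_frames})")
        mask[idx] = True
    return mask


def frames_to_generate(mask):
    """Indices of frames with no input pose, i.e. the frames to be filled in."""
    return [i for i, known in enumerate(mask) if not known]


if __name__ == "__main__":
    # Three input poses: first frame, a middle frame, and the last frame.
    mask = keyframe_mask([0, 37, 74])
    print(duration_seconds())             # 5.0
    print(len(frames_to_generate(mask)))  # 72
```

Because the number of input poses is variable, the same layout covers both sparse keyframe in-betweening (a few known frames) and stitching-like cases in which longer runs of frames at the start and end are known and only the gap between them is generated.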