CV · Dec 19, 2020

GlocalNet: Class-aware Long-term Human Motion Synthesis

arXiv:2012.10744v1 · 16 citations
AI Analysis

This work addresses the problem of generating realistic, long-term human motion for applications such as AR and 3D animation, a task that remains difficult for animators and developers.

This paper tackles the challenge of synthesizing long-term human motion trajectories (over 6000 ms) across more than 50 human activity classes. The authors propose a two-stage method that first learns sparse global pose dependencies and then generates dense motion trajectories, outperforming state-of-the-art methods on public datasets.

Synthesis of long-term human motion skeleton sequences is essential to aid human-centric video generation, with potential applications in Augmented Reality, 3D character animation, pedestrian trajectory prediction, etc. Long-term human motion synthesis is a challenging task due to multiple factors such as long-term temporal dependencies among poses, cyclic repetition across poses, bi-directional and multi-scale dependencies among poses, variable speed of actions, and a large as well as partially overlapping space of temporal pose variations across multiple classes/types of human activities. This paper aims to address these challenges to synthesize a long-term (> 6000 ms) human motion trajectory across a large variety of human activity classes (> 50). We propose a two-stage activity generation method to achieve this goal: the first stage learns the long-term global pose dependencies in activity sequences by learning to synthesize a sparse motion trajectory, while the second stage generates dense motion trajectories from the output of the first stage. We demonstrate the superiority of the proposed method over state-of-the-art (SOTA) methods using various quantitative evaluation metrics on publicly available datasets.
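The sparse-then-dense structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: both stages in the paper are learned models, whereas here stage one is a hypothetical sinusoidal placeholder and stage two is plain linear interpolation. The function names, the one-value-per-joint pose representation, and all parameters are assumptions made for illustration only.

```python
import math
import random


def synthesize_sparse(num_keyframes, num_joints, class_id, seed=0):
    """Stage 1 stand-in: return a sparse trajectory of keyframe poses.

    Hypothetical placeholder for a learned, class-conditioned generator.
    Each joint value follows a sinusoid whose frequency depends on class_id.
    Requires num_keyframes >= 2.
    """
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_joints)]
    freq = 1 + class_id % 3  # arbitrary class-dependent frequency
    return [
        [
            math.sin(freq * 2.0 * math.pi * k / (num_keyframes - 1) + phases[j])
            for j in range(num_joints)
        ]
        for k in range(num_keyframes)
    ]


def densify(sparse, factor):
    """Stage 2 stand-in: fill in frames between consecutive keyframes.

    Uses linear interpolation (an assumption; the paper's second stage is
    a learned model). Produces (len(sparse) - 1) * factor + 1 frames.
    """
    dense = []
    for a, b in zip(sparse, sparse[1:]):
        for step in range(factor):
            t = step / factor
            dense.append([(1.0 - t) * x + t * y for x, y in zip(a, b)])
    dense.append(list(sparse[-1]))  # keep the final keyframe
    return dense


sparse = synthesize_sparse(num_keyframes=16, num_joints=4, class_id=7)
dense = densify(sparse, factor=10)
print(len(sparse), len(dense))  # prints "16 151"
```

The point of the split is that the first stage only has to model long-range structure over a short sequence of keyframes, while the second stage handles local, frame-to-frame detail.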

