CV · Apr 29, 2020

Motion Guided 3D Pose Estimation from Videos

Jingbo Wang, Sijie Yan, Yuanjun Xiong, Dahua Lin

arXiv:2004.13985v124.4246 citations

Originality: Incremental advance

AI Analysis

This work addresses monocular 3D human pose estimation, offering incremental improvements through a new loss function and network design.

The paper tackles monocular 3D human pose estimation from videos by introducing a motion loss and a U-shaped GCN architecture. It reports state-of-the-art results on the Human3.6M and MPI-INF-3DHP benchmarks, with smoother output sequences and better recovery of keypoint motion.

We propose a new loss function, called motion loss, for the problem of monocular 3D human pose estimation from 2D pose. In computing motion loss, a simple yet effective representation for keypoint motion, called pairwise motion encoding, is introduced. We design a new graph convolutional network architecture, U-shaped GCN (UGCN). It captures both short-term and long-term motion information to fully leverage the additional supervision from the motion loss. We experiment with training UGCN with the motion loss on two large-scale benchmarks: Human3.6M and MPI-INF-3DHP. Our model surpasses other state-of-the-art models by a large margin. It also demonstrates strong capacity in producing smooth 3D sequences and recovering keypoint motion.
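The abstract names the motion loss and pairwise motion encoding without giving formulas, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the encoding pairs each keypoint's 3D position at frame t with its position at frame t + tau and combines them with a cross product, and that the loss is the mean absolute difference between predicted and ground-truth encodings over several intervals. The function names, the choice of cross product, and the default intervals are illustrative assumptions, not details taken from the listing above.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
# seq[t][j] is the 3D position of keypoint j at frame t.
PoseSequence = List[List[Vec3]]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Standard 3D cross product."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def pairwise_motion_encoding(seq: PoseSequence, tau: int) -> List[List[Vec3]]:
    """Encode motion by pairing each keypoint at frame t with itself at t + tau.

    Assumption: the pairing operator is a cross product. Other operators
    (for example, plain subtraction) would fit the same interface.
    """
    return [
        [cross(seq[t][j], seq[t + tau][j]) for j in range(len(seq[t]))]
        for t in range(len(seq) - tau)
    ]


def motion_loss(
    pred: PoseSequence,
    target: PoseSequence,
    intervals: Sequence[int] = (1, 2),  # hypothetical default, chosen arbitrarily
) -> float:
    """Mean absolute difference between predicted and target motion encodings."""
    total = 0.0
    count = 0
    for tau in intervals:
        if tau >= len(pred):
            continue  # sequence too short for this interval
        enc_pred = pairwise_motion_encoding(pred, tau)
        enc_target = pairwise_motion_encoding(target, tau)
        for frame_p, frame_t in zip(enc_pred, enc_target):
            for vec_p, vec_t in zip(frame_p, frame_t):
                total += sum(abs(x - y) for x, y in zip(vec_p, vec_t))
                count += 3
    return total / count if count else 0.0
```

Using several intervals is one way to supervise both short-term and long-term motion at once, which matches the abstract's emphasis on both time scales. In practice such a term would be added to a per-frame position loss rather than used alone, since a prediction can have the right motion encoding while still being offset from the true pose.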

