CV · Nov 25, 2017

Learning 3D Human Pose from Structure and Motion

arXiv:1711.09250v2 · 107 citations
Originality: Incremental advance
AI Analysis

The paper addresses accurate 3D human pose estimation for computer vision applications, offering an incremental improvement over existing methods.

The paper tackles 3D human pose estimation from single in-the-wild images by proposing two anatomically inspired loss functions and a weakly-supervised learning framework, improving on the state of the art by 11.8% on Human3.6M and 12% on MPI-INF-3DHP while running at 30 FPS on a commodity graphics card.

3D human pose estimation from a single image is a challenging problem, especially for in-the-wild settings due to the lack of 3D annotated data. We propose two anatomically inspired loss functions and use them with a weakly-supervised learning framework to jointly learn from large-scale in-the-wild 2D and indoor/synthetic 3D data. We also present a simple temporal network that exploits temporal and structural cues present in predicted pose sequences to temporally harmonize the pose estimations. We carefully analyze the proposed contributions through loss surface visualizations and sensitivity analysis to facilitate deeper understanding of their working mechanism. Our complete pipeline improves the state-of-the-art by 11.8% and 12% on Human3.6M and MPI-INF-3DHP, respectively, and runs at 30 FPS on a commodity graphics card.
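The abstract does not spell out the two loss functions, so the following is only an illustrative sketch of what an anatomically inspired loss might look like, not the paper's formulation. It assumes a left/right bone-length symmetry prior: mirrored bones in a predicted 3D pose should have equal length. The joint indices, the `LEFT_RIGHT_BONES` table, and the function names are hypothetical and chosen only for this example.

```python
import math

# Hypothetical skeleton: each entry pairs a left-side bone with its
# right-side mirror. A bone is a (parent_joint, child_joint) index pair.
LEFT_RIGHT_BONES = [
    ((0, 1), (0, 2)),  # e.g. pelvis -> left hip  vs  pelvis -> right hip
    ((1, 3), (2, 4)),  # e.g. left hip -> left knee  vs  right hip -> right knee
]


def bone_length(pose, bone):
    """Euclidean length of one bone in a pose given as a list of (x, y, z)."""
    parent, child = bone
    return math.dist(pose[parent], pose[child])


def symmetry_loss(pose, bone_pairs=LEFT_RIGHT_BONES):
    """Mean absolute length difference between mirrored bones.

    Assumed prior (not taken from the abstract): a plausible human pose
    has equal-length left and right limbs, so the loss is zero for a
    perfectly symmetric skeleton and grows with the asymmetry.
    """
    diffs = [
        abs(bone_length(pose, left) - bone_length(pose, right))
        for left, right in bone_pairs
    ]
    return sum(diffs) / len(diffs)
```

A loss of this kind needs no 3D ground truth, which is why such priors suit weakly-supervised training on images that carry only 2D annotations: the penalty depends solely on the predicted pose.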

Code Implementations: 1 repo
