CV · Nov 30, 2020

CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild

arXiv:2011.14679v1 · 123 citations
AI Analysis

This work matters to computer vision researchers and practitioners who need 3D human pose estimates in settings where traditional motion capture data is difficult or impossible to acquire, such as outdoor sports. It removes the need for both labeled data and calibrated cameras.

This paper tackles the problem of 3D human pose estimation from single images without requiring labeled training data. It achieves this by using a self-supervised approach that disentangles 2D pose into 3D pose and camera rotation from unlabeled multi-view data, even with moving cameras. The method is evaluated on Human3.6M, MPII-INF-3DHP, and SkiPose datasets.

Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g., outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single image 3D pose estimator from unlabeled multi-view data. To this end, we exploit multi-view consistency constraints to disentangle the observed 2D pose into the underlying 3D pose and camera rotation. In contrast to most existing methods, we do not require calibrated cameras and can therefore learn from moving cameras. Nevertheless, in the case of a static camera setup, we present an optional extension to include constant relative camera rotations over multiple views into our framework. Key to the success are new, unbiased reconstruction objectives that mix information across views and training samples. The proposed approach is evaluated on two benchmark datasets (Human3.6M and MPII-INF-3DHP) and on the in-the-wild SkiPose dataset.
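
The idea of mixing information across views can be illustrated with a small NumPy sketch. It is not the authors' implementation: the orthographic projection, the normalization, the L1 error, and every function name here are assumptions made for illustration, and the "network outputs" are replaced by ground-truth stand-ins. The point is only that a canonical 3D pose estimated from one view, rotated by the camera rotation estimated from another view, should reproject onto the 2D pose observed in that other view.

```python
import numpy as np


def rotation_y(angle_rad: float) -> np.ndarray:
    """3x3 rotation about the vertical (y) axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def project(pose_3d: np.ndarray) -> np.ndarray:
    """Assumed orthographic projection: keep x and y, drop depth."""
    return pose_3d[:, :2]


def normalize_2d(pose_2d: np.ndarray) -> np.ndarray:
    """Center the joints and scale them to unit Frobenius norm."""
    centered = pose_2d - pose_2d.mean(axis=0, keepdims=True)
    return centered / np.linalg.norm(centered)


def mixed_reprojection_loss(canonical_pose, rotation, target_2d) -> float:
    """Rotate a canonical pose into a camera frame, project it,
    and compare it with the 2D pose observed by that camera (mean L1)."""
    reprojected = project(canonical_pose @ rotation.T)
    return float(np.abs(normalize_2d(reprojected) - normalize_2d(target_2d)).mean())


# Synthetic two-view setup: one 17-joint pose seen by two cameras.
rng = np.random.default_rng(0)
true_pose = rng.normal(size=(17, 3))
rot_a, rot_b = rotation_y(0.3), rotation_y(1.2)
pose2d_b = project(true_pose @ rot_b.T)

# Hypothetical stand-ins for what a network would predict:
canon_from_a = true_pose   # canonical 3D pose predicted from view A
rot_from_b = rot_b         # camera rotation predicted from view B

# Mixing: pose from view A combined with the rotation from view B
# should explain the 2D observation in view B.
loss_mixed = mixed_reprojection_loss(canon_from_a, rot_from_b, pose2d_b)
# Pairing the pose with the wrong camera rotation should not.
loss_wrong = mixed_reprojection_loss(canon_from_a, rot_a, pose2d_b)

print(loss_mixed, loss_wrong)
```

In a training loop, an error of this kind would be computed on predicted rather than ground-truth quantities and minimized over the network weights; because it only needs 2D observations from several views, no 3D labels or camera calibration enter the computation.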

Code Implementations: 1 repo