CV · Jul 29, 2021

Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows

arXiv:2107.13788v2 · 149 citations · Has Code
Originality: Highly original
AI Analysis

The paper addresses depth ambiguities and occlusions in monocular 3D human pose estimation, a core computer vision task, with a probabilistic approach.

The paper tackles the ill-posed problem of 3D human pose estimation from monocular images by generating a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. Its normalizing-flow-based method outperforms comparable methods in most metrics on the Human3.6M and MPI-INF-3DHP benchmark datasets.

3D human pose estimation from monocular images is a highly ill-posed problem due to depth ambiguities and occlusions. Nonetheless, most existing works ignore these ambiguities and only estimate a single solution. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. To this end, we propose a normalizing-flow-based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem. Additionally, uncertain detections and occlusions are effectively modeled by incorporating uncertainty information of the 2D detector as a condition. Further keys to success are a learned 3D pose prior and a generalization of the best-of-M loss. We evaluate our approach on the two benchmark datasets, Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics. The implementation is available on GitHub.
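The two ideas the abstract relies on, a deterministic 3D-to-2D mapping with an ambiguous inverse and a best-of-M error over a set of hypotheses, can be illustrated with a toy sketch. This is not the paper's implementation: the pinhole `project` function, the two-joint poses, and the helper names `mpjpe` and `best_of_m` are assumptions made for illustration, and `best_of_m` shows only the plain best-of-M selection, not the generalization the paper proposes.

```python
import math

def project(pose, focal=1.0):
    """Pinhole projection of 3D joints (x, y, z) to 2D (u, v).
    Assumed camera model for illustration; deterministic by construction."""
    return [(focal * x / z, focal * y / z) for x, y, z in pose]

def mpjpe(pred, target):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(target)

def best_of_m(hypotheses, target):
    """Plain best-of-M: return (index, error) of the hypothesis closest to target."""
    errors = [mpjpe(h, target) for h in hypotheses]
    best = min(range(len(errors)), key=errors.__getitem__)
    return best, errors[best]

# Two hypothetical 2-joint poses. pose_b moves the second joint of pose_a
# along its camera ray, so both poses project to the same 2D observation.
pose_a = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
pose_b = [(0.0, 0.0, 2.0), (1.5, 0.0, 3.0)]

same_2d = project(pose_a) == project(pose_b)   # depth ambiguity
index, error = best_of_m([pose_a, pose_b], target=pose_b)
print(same_2d, index, error)
```

Because both poses yield identical 2D joints, a single-solution estimator cannot tell them apart from the image alone, whereas a set of hypotheses scored by its best member can still contain the correct pose.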

