CV · Aug 20, 2024

MPL: Lifting 3D Human Pose from Multi-view 2D Poses

arXiv:2408.10805v1 · 6 citations · h-index: 36 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of generalizing 3D pose estimation to real-world scenarios by training on synthetic data. The advance is incremental, as it builds on existing multi-view and lifting approaches.

The paper tackles the problem of estimating 3D human poses from 2D images. It proposes a method that combines 2D pose estimation with a transformer-based network for 2D-to-3D lifting, achieving up to a 45% reduction in MPJPE (mean per-joint position error) compared to triangulation.
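For readers unfamiliar with the metric, the following is a minimal sketch of how MPJPE and a relative reduction against a baseline are commonly computed. The function names, the 17-joint skeleton, and the numbers are assumptions chosen for illustration; they are not taken from the paper or its code.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: the average Euclidean distance
    between predicted and ground-truth joints. Shapes: (..., J, 3)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def relative_reduction(baseline_err, method_err):
    """Fraction by which method_err improves on baseline_err."""
    return (baseline_err - method_err) / baseline_err

# Toy data with made-up offsets (NOT results from the paper).
gt = np.zeros((17, 3))                            # hypothetical 17-joint pose, in mm
triangulated = gt + np.array([100.0, 0.0, 0.0])   # every joint off by 100 mm
lifted = gt + np.array([0.0, 55.0, 0.0])          # every joint off by 55 mm

err_tri = mpjpe(triangulated, gt)
err_lift = mpjpe(lifted, gt)
reduction = relative_reduction(err_tri, err_lift)
```

In this toy case the baseline error is 100 mm and the second error is 55 mm, so the relative reduction is 0.45, which is how a figure such as "up to 45%" would typically be read.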

Estimating 3D human poses from 2D images is challenging due to occlusions and projective acquisition. Learning-based approaches have been widely studied to address this challenge, in both single-view and multi-view setups. These solutions, however, fail to generalize to real-world cases due to the lack of (multi-view) 'in-the-wild' images paired with 3D poses for training. For this reason, we propose combining 2D pose estimation, for which large and rich training datasets exist, with 2D-to-3D pose lifting, using a transformer-based network that can be trained from synthetic 2D-3D pose pairs. Our experiments demonstrate decreases of up to 45% in MPJPE compared to the 3D pose obtained by triangulating the 2D poses. The framework's source code is available at https://github.com/aghasemzadeh/OpenMPL.
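The triangulation baseline mentioned in the abstract can be illustrated with a standard linear (DLT) triangulation. The sketch below is a generic textbook version, not the authors' implementation. The camera matrices, the 17-joint random pose, and all function names are hypothetical. The `project` helper also shows one simple way that 2D-3D pairs could be produced by projecting a 3D pose into calibrated views. How the paper actually generates its synthetic pairs is not described here.

```python
import numpy as np

def project(P, X):
    """Project 3D points X of shape (J, 3) with a 3x4 camera matrix P -> (J, 2)."""
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]

def triangulate_point(Ps, xs):
    """Linear (DLT) triangulation of one joint from its 2D observations.
    Ps: list of 3x4 camera matrices; xs: matching list of (u, v) pixels."""
    rows = []
    for P, (u, v) in zip(Ps, xs):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    Xh = Vt[-1]
    return Xh[:3] / Xh[3]

def triangulate_pose(Ps, poses_2d):
    """poses_2d: (V, J, 2) 2D poses from V views -> (J, 3) 3D pose."""
    poses_2d = np.asarray(poses_2d, dtype=float)
    return np.stack([triangulate_point(Ps, poses_2d[:, j])
                     for j in range(poses_2d.shape[1])])

# Hypothetical two-camera rig: shared intrinsics, second camera shifted along x.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
cameras = [P1, P2]

# Random stand-in for a 3D pose, placed in front of both cameras.
rng = np.random.default_rng(0)
pose_3d = rng.uniform([-1.0, -1.0, 3.0], [1.0, 1.0, 5.0], size=(17, 3))

poses_2d = np.stack([project(P, pose_3d) for P in cameras])  # (2, 17, 2)
recon = triangulate_pose(cameras, poses_2d)
```

With noise-free 2D inputs, as here, the reconstruction matches the original pose up to numerical precision. With noisy or occluded 2D detections the DLT estimate degrades, and that is the setting in which the abstract compares a learned lifting network against triangulation.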

Code Implementations: 1 repo
