CV · Apr 14, 2025

Benchmarking 3D Human Pose Estimation Models under Occlusions

arXiv:2504.10350v2 · 6 citations · h-index: 1
Originality: Synthesis-oriented
AI Analysis

The paper addresses occlusion robustness in 3D human pose estimation, a problem that matters for real-world applications. Its contribution is incremental: it benchmarks existing models rather than proposing new solutions.

This paper benchmarks the robustness of nine state-of-the-art 3D human pose estimation models under realistic occlusion conditions, finding that all models show significant performance degradation, with diffusion-based models underperforming and distal joints being consistently vulnerable.

Human Pose Estimation (HPE) involves detecting and localizing keypoints on the human body from visual data. In 3D HPE, occlusions, where parts of the body are not visible in the image, pose a significant challenge for accurate pose reconstruction. This paper presents a benchmark on the robustness of 3D HPE models under realistic occlusion conditions, involving combinations of occluded keypoints commonly observed in real-world scenarios. We evaluate nine state-of-the-art 2D-to-3D HPE models, spanning convolutional, transformer-based, graph-based, and diffusion-based architectures, using the BlendMimic3D dataset, a synthetic dataset with ground-truth 2D/3D annotations and occlusion labels. All models were originally trained on Human3.6M and tested here without retraining to assess their generalization. We introduce a protocol that simulates occlusion by adding noise into 2D keypoints based on real detector behavior, and conduct both global and per-joint sensitivity analyses. Our findings reveal that all models exhibit notable performance degradation under occlusion, with diffusion-based models underperforming despite their stochastic nature. Additionally, a per-joint occlusion analysis identifies consistent vulnerability in distal joints (e.g., wrists, feet) across models. Overall, this work highlights critical limitations of current 3D HPE models in handling occlusions, and provides insights for improving real-world robustness.
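The occlusion protocol and per-joint analysis described above can be illustrated with a minimal sketch. The abstract does not specify the noise model or the error metric, so everything here is an assumption: zero-mean Gaussian noise with a hypothetical `sigma` stands in for the detector-derived noise, and the mean Euclidean distance per joint stands in for the per-joint sensitivity measure. The function names `perturb_occluded` and `per_joint_error` are illustrative and do not come from the paper.

```python
import math
import random


def perturb_occluded(keypoints_2d, occluded, sigma=0.05, seed=0):
    """Return a copy of the 2D keypoints with noise added to occluded joints.

    keypoints_2d: list of (x, y) tuples, one per joint.
    occluded:     list of bools, True where the joint is labelled occluded.
    sigma:        assumed noise scale (hypothetical; the paper derives its
                  noise from real detector behavior, which is not modelled here).
    """
    rng = random.Random(seed)
    noisy = []
    for (x, y), is_occluded in zip(keypoints_2d, occluded):
        if is_occluded:
            # Assumption: isotropic zero-mean Gaussian noise per coordinate.
            x += rng.gauss(0.0, sigma)
            y += rng.gauss(0.0, sigma)
        noisy.append((x, y))
    return noisy


def per_joint_error(pred_3d, gt_3d):
    """Mean Euclidean error per joint, averaged over frames.

    pred_3d, gt_3d: list of frames, each a list of (x, y, z) tuples per joint.
    """
    num_frames = len(gt_3d)
    num_joints = len(gt_3d[0])
    totals = [0.0] * num_joints
    for pred_frame, gt_frame in zip(pred_3d, gt_3d):
        for j in range(num_joints):
            totals[j] += math.dist(pred_frame[j], gt_frame[j])
    return [total / num_frames for total in totals]
```

In a benchmark of this kind, the perturbed 2D keypoints would be fed to each frozen 2D-to-3D model, and the resulting per-joint errors compared against the unperturbed baseline to see which joints degrade most.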
