CV | Jan 10, 2024

Diffusion-based Pose Refinement and Multi-hypothesis Generation for 3D Human Pose Estimation

arXiv:2401.04921v1 | 5 citations | h-index: 5 | Has Code | ICASSP
Originality: Incremental advance
AI Analysis

This work addresses accuracy and uncertainty issues in 3D human pose estimation for computer vision applications, representing an incremental improvement through a novel refinement method.

The paper targets excessive uncertainty in probabilistic models for 3D human pose estimation, which weakens single-hypothesis accuracy and yields hypotheses that stray far from the true pose. It proposes DRPose, a diffusion-based framework that refines the outputs of deterministic models, and reports state-of-the-art performance on the Human3.6M and MPI-INF-3DHP datasets for both single- and multi-hypothesis tasks.

Previous probabilistic models for 3D Human Pose Estimation (3DHPE) aimed to enhance pose accuracy by generating multiple hypotheses. However, most of the generated hypotheses deviate substantially from the true pose. Moreover, compared to deterministic models, the excessive uncertainty in probabilistic models leads to weaker performance in single-hypothesis prediction. To address these two challenges, we propose a diffusion-based refinement framework called DRPose, which refines the output of deterministic models by reverse diffusion and, through multi-step refinement with multiple noise samples, produces multi-hypothesis predictions better suited to the current pose benchmark. To this end, we propose a Scalable Graph Convolution Transformer (SGCT) and a Pose Refinement Module (PRM) for denoising and refining. Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets demonstrate that our method achieves state-of-the-art performance on both single- and multi-hypothesis 3DHPE. Code is available at https://github.com/KHB1698/DRPose.
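The refinement idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the DRPose implementation: the cosine noise schedule, the deterministic DDIM-style update, the 17-joint pose shape, and every function name here (`alpha_bar`, `toy_denoiser`, `refine`, `mpjpe`) are assumptions made for illustration, and `toy_denoiser` is a hand-written stand-in for the learned SGCT denoiser. The sketch perturbs a deterministic model's pose with several independent noise samples, denoises each over a few reverse steps to obtain multiple hypotheses, and scores them with mean per-joint position error.

```python
import numpy as np

T = 1000  # assumed number of diffusion timesteps


def alpha_bar(t, total=T, s=0.008):
    """Cumulative signal level at step t under a cosine schedule (assumed choice)."""
    f = lambda u: np.cos((u / total + s) / (1 + s) * np.pi / 2) ** 2
    return f(t) / f(0)


def toy_denoiser(x_t, init_pose, a_t):
    """Hypothetical stand-in for a trained denoiser: estimates the clean pose.

    It averages the rescaled noisy sample with the deterministic estimate.
    A real denoiser would be a learned network.
    """
    return 0.5 * (x_t / np.sqrt(a_t)) + 0.5 * init_pose


def refine(init_pose, num_hypotheses=5, start_t=250, steps=5, seed=0):
    """Return refined hypotheses of shape (num_hypotheses, joints, 3)."""
    rng = np.random.default_rng(seed)
    a = alpha_bar(start_t)
    # One independent noise sample per hypothesis (forward diffusion to start_t).
    noise = rng.standard_normal((num_hypotheses,) + init_pose.shape)
    x_t = np.sqrt(a) * init_pose + np.sqrt(1.0 - a) * noise
    # Evenly spaced timesteps from start_t down to 0.
    ts = np.linspace(start_t, 0, steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        a_cur, a_next = alpha_bar(t_cur), alpha_bar(t_next)
        x0_hat = toy_denoiser(x_t, init_pose, a_cur)
        # Noise implied by the current clean-pose estimate.
        eps_hat = (x_t - np.sqrt(a_cur) * x0_hat) / np.sqrt(1.0 - a_cur)
        # Deterministic DDIM-style step to the next, less noisy level.
        x_t = np.sqrt(a_next) * x0_hat + np.sqrt(1.0 - a_next) * eps_hat
    return x_t


def mpjpe(pred, gt):
    """Mean per-joint position error, averaged over the joint axis."""
    return np.linalg.norm(pred - gt, axis=-1).mean(axis=-1)


# Usage with synthetic data: 17 joints, as is common for Human3.6M skeletons.
init = np.random.default_rng(1).standard_normal((17, 3))  # deterministic model output
gt = init + 0.05                                           # made-up ground truth
hyps = refine(init)
single_error = mpjpe(hyps.mean(axis=0), gt)  # single hypothesis: average of all
best_error = mpjpe(hyps, gt).min()           # multi-hypothesis: best of K
```

Averaging the hypotheses is only one possible way to form a single prediction, and taking the best of K requires ground truth, so it is an evaluation protocol rather than an inference procedure.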

Code Implementations: 1 repo
