CV · AI · Jun 29, 2023

Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation

arXiv:2306.17074v1 · 17 citations · h-index: 60
Originality: Incremental advance
AI Analysis

The paper addresses 2D human pose estimation for computer vision applications, offering a new diffusion-based scheme with incremental performance gains.

The paper tackles 2D human pose estimation by formulating it as a heatmap generation problem solved with a diffusion model, achieving improvements of 1.6, 1.2, and 1.2 mAP on the COCO, CrowdPose, and AI Challenge datasets, respectively.

One of the mainstream schemes for 2D human pose estimation (HPE) is learning keypoint heatmaps with a neural network. Existing methods typically improve heatmap quality through customized architectures, such as high-resolution representations and vision Transformers. In this paper, we propose DiffusionPose, a new scheme that formulates 2D HPE as the problem of generating keypoint heatmaps from noised heatmaps. During training, the keypoints are diffused to a random distribution by adding noise, and the diffusion model learns to recover the ground-truth heatmaps from the noised heatmaps, conditioned on image features. During inference, the diffusion model generates heatmaps from initialized heatmaps by progressive denoising. We further explore improving DiffusionPose with conditions derived from human structural information. Extensive experiments show the effectiveness of DiffusionPose, with improvements of 1.6, 1.2, and 1.2 mAP on the widely used COCO, CrowdPose, and AI Challenge datasets, respectively.
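The training and inference procedure described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption for illustration, not the paper's implementation: the 64x48 heatmap size, the cosine noise schedule, the deterministic DDIM-style update, and the hypothetical `predict_x0` callable that stands in for the image-conditioned denoising network.

```python
import numpy as np

def gaussian_heatmap(x, y, size=(64, 48), sigma=2.0):
    """Render one keypoint at pixel (x, y) as a Gaussian heatmap of shape (H, W)."""
    h, w = size
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))

def cosine_alpha_bar(num_steps):
    """Cumulative signal level for t = 0..num_steps (assumed cosine schedule)."""
    t = np.linspace(0.0, 1.0, num_steps + 1)
    f = np.cos((t + 0.008) / 1.008 * np.pi / 2.0) ** 2
    return f / f[0]  # alpha_bar[0] == 1, alpha_bar[-1] is close to 0

def add_noise(x0, t, alpha_bar, rng):
    """Training side: diffuse a clean heatmap x0 to noise level t."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def denoise(x_init, predict_x0, alpha_bar, steps):
    """Inference side: progressive denoising over a decreasing list of
    timesteps that ends at 0. predict_x0(x, t) is a hypothetical stand-in
    for the network that estimates the clean heatmap from a noised one."""
    x = x_init
    for t_cur, t_next in zip(steps[:-1], steps[1:]):
        x0_hat = predict_x0(x, t_cur)
        a_cur, a_next = alpha_bar[t_cur], alpha_bar[t_next]
        # Noise implied by the current sample and the clean estimate.
        eps_hat = (x - np.sqrt(a_cur) * x0_hat) / np.sqrt(1.0 - a_cur)
        # Re-noise the clean estimate to the next, lower noise level.
        x = np.sqrt(a_next) * x0_hat + np.sqrt(1.0 - a_next) * eps_hat
    return x

rng = np.random.default_rng(0)
alpha_bar = cosine_alpha_bar(1000)
target = gaussian_heatmap(x=10, y=20)

# Training pair: noised heatmap and the noise that produced it.
noised, noise = add_noise(target, 500, alpha_bar, rng)

# Inference with an oracle in place of a trained network.
start = rng.standard_normal(target.shape)
result = denoise(start, lambda x, t: target, alpha_bar, [1000, 750, 500, 250, 0])
```

In a real system, `predict_x0` would be a trained network that also receives image features (and, per the abstract, structural conditions); the oracle above only exercises the sampling loop.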

