Denoising and Selecting Pseudo-Heatmaps for Semi-Supervised Human Pose Estimation
This work addresses the challenge of limited labeled data for human pose estimation, offering a significant performance boost in low-data regimes.
The paper tackles semi-supervised human pose estimation by proposing a method that denoises and selects pseudo-heatmaps, surpassing the best competitor by 7.22 mAP with only 0.5K labeled images.
We propose a new semi-supervised learning design for human pose estimation that revisits the popular dual-student framework and enhances it in two ways. First, we introduce a denoising scheme to generate reliable pseudo-heatmaps as targets for learning from unlabeled data. This scheme uses multi-view augmentations and a threshold-and-refine procedure to produce a pool of pseudo-heatmaps. Second, we select the learning targets from this pool, guided by the estimated cross-student uncertainty. We evaluate our method under multiple setups on the COCO benchmark. Our results show that our model outperforms previous state-of-the-art semi-supervised pose estimators, especially in the extreme low-data regime. For example, with only 0.5K labeled images, our method surpasses the best competitor by 7.22 mAP (+25% relative improvement). We also demonstrate that our model can learn effectively from unlabeled data in the wild to further boost its generalization and performance.
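The two steps described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not specify the threshold rule, the refinement step, or the uncertainty estimate, so everything here is an assumption made for illustration. In particular, the function names (`denoise_views`, `peak_coords`, `select_targets`), the fixed `threshold`, averaging across views as the "refine" step, and using the distance between the two students' peak locations as a proxy for cross-student uncertainty are all hypothetical choices.

```python
import numpy as np

def denoise_views(view_heatmaps, threshold=0.3):
    """Threshold-and-refine over augmented views (illustrative assumption).

    view_heatmaps: array of shape (V, K, H, W) holding heatmaps predicted
    on V augmented views, already mapped back to the original image frame.
    """
    # Threshold: suppress low-confidence responses in every view.
    kept = np.where(view_heatmaps >= threshold, view_heatmaps, 0.0)
    # Refine: average the surviving responses across views.
    return kept.mean(axis=0)  # shape (K, H, W)

def peak_coords(heatmaps):
    """Return the (x, y) location of the maximum of each of K heatmaps."""
    num_joints, _, width = heatmaps.shape
    flat_idx = heatmaps.reshape(num_joints, -1).argmax(axis=1)
    return np.stack([flat_idx % width, flat_idx // width], axis=1).astype(float)

def select_targets(pool_a, pool_b, max_disagreement=2.0):
    """Keep joints on which the two students agree (hypothetical proxy).

    pool_a, pool_b: denoised pseudo-heatmaps of shape (K, H, W), one per student.
    Returns fused targets and a boolean mask of shape (K,) marking the joints
    whose peak locations differ by at most max_disagreement pixels.
    """
    dist = np.linalg.norm(peak_coords(pool_a) - peak_coords(pool_b), axis=1)
    mask = dist <= max_disagreement
    targets = 0.5 * (pool_a + pool_b)
    return targets, mask
```

In a training loop, the returned mask would typically be used to weight the per-joint loss on unlabeled images, so that joints on which the students disagree contribute nothing. Whether the paper masks, reweights, or picks one student's heatmap per joint is not stated in the abstract.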