CV · Mar 17, 2020

Augmented Parallel-Pyramid Net for Attention Guided Pose-Estimation

arXiv:2003.07516v1 · 4 citations
AI Analysis

This paper addresses human pose estimation, a core computer vision task, and offers incremental improvements.

The paper tackles human pose estimation by proposing an augmented parallel-pyramid net with attention modules and differentiable auto-data augmentation, achieving top-1 accuracy on the COCO keypoint benchmark and state-of-the-art results on MPII datasets.

The goal of human pose estimation is to determine the body part or joint locations of each person in an image. This is a challenging problem with wide applications. To address it, this paper proposes an augmented parallel-pyramid net with an attention partial module and differentiable auto-data augmentation. Technically, a parallel pyramid structure is proposed to compensate for the loss of information. We adopt the parallel structure for reverse compensation, and the overall computational complexity does not increase. We further define an Attention Partial Module (APM) operator to extract weighted features from the different-scale feature maps generated by the parallel pyramid structure. Compared with refining through an upsampling operator, APM can better capture the relationship between channels. Finally, we propose a differentiable auto-data augmentation method to further improve estimation accuracy. We define a new pose search space in which sequences of data augmentations are formulated as a trainable and operational CNN component. Experiments corroborate the effectiveness of the proposed method. Notably, our method achieves top-1 accuracy on the challenging COCO keypoint benchmark and state-of-the-art results on the MPII dataset.
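The abstract does not specify how APM is built, so the following is only a rough illustration of the general idea it describes: fusing feature maps from two scales and then reweighting channels rather than relying on upsampling alone. It is a generic, parameter-free channel gate in the style of squeeze-and-excitation, not the paper's APM. All function names (`upsample_nearest`, `channel_attention`, `fuse_scales`) and the choice of a sigmoid gate on the channel mean are assumptions made for this sketch.

```python
import math

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a [C][H][W] nested-list feature map."""
    return [
        [[row[x // factor] for x in range(len(row) * factor)]
         for row in channel for _ in range(factor)]
        for channel in fmap
    ]

def channel_attention(fmap):
    """Scale each channel by a sigmoid gate on its global average.

    Assumption: a learned module would replace this fixed gate with
    trainable layers; the gate here has no parameters.
    """
    gated = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        gate = 1.0 / (1.0 + math.exp(-mean))
        gated.append([[v * gate for v in row] for row in channel])
    return gated

def fuse_scales(fine, coarse, factor):
    """Upsample the coarse map, add it to the fine map, then gate channels."""
    up = upsample_nearest(coarse, factor)
    summed = [
        [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(c1, c2)]
        for c1, c2 in zip(fine, up)
    ]
    return channel_attention(summed)

# Hypothetical inputs: one channel, a 2x2 fine map and a 1x1 coarse map.
fine = [[[1.0, 1.0], [1.0, 1.0]]]
coarse = [[[-1.0]]]
fused = fuse_scales(fine, coarse, 2)
```

In this sketch the gate depends only on each channel's mean activation, so channels with stronger average responses are attenuated less. A trained attention module would instead learn how channels relate to one another.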

