CV · Oct 6, 2017

Human Pose Regression by Combining Indirect Part Detection and Contextual Information

arXiv:1710.02322v1 · 265 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses human pose estimation for computer vision applications. It offers a fully differentiable framework that folds contextual information directly into the pose predictions, an incremental improvement over existing regression methods.

The paper tackles human pose estimation from still images with an end-to-end trainable regression approach. A Soft-argmax function converts feature maps to joint coordinates. The method achieves the best performance among regression methods and results comparable to state-of-the-art detection-based approaches on the LSP and MPII datasets.

In this paper, we propose an end-to-end trainable regression approach for human pose estimation from still images. We use the proposed Soft-argmax function to convert feature maps directly to joint coordinates, resulting in a fully differentiable framework. Our method is able to learn heat map representations indirectly, without additional steps of artificial ground truth generation. Consequently, contextual information can be included in the pose predictions in a seamless way. We evaluated our method on two very challenging datasets, the Leeds Sports Poses (LSP) and the MPII Human Pose datasets, reaching the best performance among all the existing regression methods and comparable results to the state-of-the-art detection-based approaches.
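
The abstract's central idea, converting a feature map to joint coordinates with a differentiable operation, can be illustrated with a small sketch. This is not the authors' code: the function name `soft_argmax_2d`, the pixel-center coordinate convention, and the normalization to [0, 1] are assumptions made for illustration. The sketch applies a softmax over all spatial positions and returns the probability-weighted average of the coordinates.

```python
import math


def soft_argmax_2d(feature_map):
    """Expected (x, y) location of a 2D map, normalized to [0, 1].

    Illustrative sketch only; the coordinate convention is an assumption.
    """
    h = len(feature_map)
    w = len(feature_map[0])

    # Spatial softmax; subtracting the maximum keeps exp() from overflowing.
    peak = max(max(row) for row in feature_map)
    weights = [[math.exp(v - peak) for v in row] for row in feature_map]
    total = sum(sum(row) for row in weights)

    # Probability-weighted average of normalized pixel-center coordinates.
    x = 0.0
    y = 0.0
    for i, row in enumerate(weights):
        for j, wgt in enumerate(row):
            p = wgt / total
            x += p * (j + 0.5) / w
            y += p * (i + 0.5) / h
    return x, y


# A map with one sharp peak at row 2, column 5 of an 8x8 grid.
fmap = [[-20.0] * 8 for _ in range(8)]
fmap[2][5] = 20.0
x, y = soft_argmax_2d(fmap)
print(round(x, 4), round(y, 4))
```

For a sharply peaked map the result sits close to the peak's location, while a flat map yields the center of the grid. Because every step is a smooth function of the input, gradients can flow from a coordinate loss back into the feature map, which is what lets heat map representations be learned indirectly.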

Code Implementations: 1 repo
