CV
Nov 18, 2019

DirectPose: Direct End-to-End Multi-Person Pose Estimation

arXiv:1911.07451v2
116 citations
Originality: Incremental advance
AI Analysis

This work aims to simplify multi-person human pose estimation and make it more efficient. The advance is incremental because it builds on ideas from anchor-free object detection.

The authors introduce DirectPose, which they describe as the first direct end-to-end framework for multi-person pose estimation. It predicts instance-aware keypoints from raw images, removing both the heuristic grouping of bottom-up methods and the bounding-box detection of top-down methods. Enhanced with a Keypoint Alignment (KPAlign) mechanism, the framework performs competitively with, or better than, strong bottom-up and top-down baselines.

We propose the first direct end-to-end multi-person pose estimation framework, termed DirectPose. Inspired by recent anchor-free object detectors, which directly regress the two corners of target bounding-boxes, the proposed framework directly predicts instance-aware keypoints for all the instances from a raw input image, eliminating the need for heuristic grouping in bottom-up methods or bounding-box detection and RoI operations in top-down ones. We also propose a novel Keypoint Alignment (KPAlign) mechanism, which overcomes the main difficulty: the lack of alignment between the convolutional features and the predictions in this end-to-end framework. KPAlign improves the framework's performance by a large margin while still keeping the framework end-to-end trainable. With non-maximum suppression (NMS) as the only postprocessing step, our proposed framework can detect multi-person keypoints with or without bounding-boxes in a single shot. Experiments demonstrate that the end-to-end paradigm can achieve competitive or better performance than previous strong baselines, in both bottom-up and top-down methods. We hope that our end-to-end approach can provide a new perspective for the human pose estimation task.
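
As a rough illustration of the single-shot idea in the abstract, the sketch below decodes keypoints regressed as offsets from a feature-map location and then applies greedy NMS as the only postprocessing step. It is not the authors' implementation: the function names, the offset parameterization, and the choice to run NMS on the box enclosing each candidate's keypoints are all assumptions made for this example, and KPAlign is not modeled.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def decode_keypoints(location: Point, offsets: List[Point]) -> List[Point]:
    """Turn offsets regressed at one location into absolute keypoints.

    Assumption: each keypoint is predicted as (dx, dy) relative to the
    location that made the prediction.
    """
    x0, y0 = location
    return [(x0 + dx, y0 + dy) for dx, dy in offsets]


def enclosing_box(keypoints: List[Point]) -> Box:
    """Smallest axis-aligned box containing all keypoints (hypothetical helper)."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(candidates: List[Tuple[float, List[Point]]],
        iou_threshold: float = 0.5) -> List[Tuple[float, List[Point]]]:
    """Greedy NMS over (score, keypoints) candidates, highest score first."""
    kept: List[Tuple[float, List[Point]]] = []
    for score, kps in sorted(candidates, key=lambda c: c[0], reverse=True):
        box = enclosing_box(kps)
        if all(iou(box, enclosing_box(k)) <= iou_threshold for _, k in kept):
            kept.append((score, kps))
    return kept


# Toy input: two neighbouring locations predict the same person,
# and a third location predicts a different person.
offsets = [(-2.0, -4.0), (2.0, -4.0), (0.0, 6.0)]
candidates = [
    (0.9, decode_keypoints((10.0, 10.0), offsets)),
    (0.6, decode_keypoints((11.0, 10.0), offsets)),
    (0.8, decode_keypoints((50.0, 50.0), offsets)),
]
poses = nms(candidates)
print([score for score, _ in poses])  # [0.9, 0.8]
```

In this toy input the second candidate overlaps the first with an IoU of 0.6 and is suppressed, so one pose per person survives without any grouping step or separate person detector.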

Code Implementations: 9 repos

Data from Papers with Code (CC-BY-SA-4.0)
