CV · Nov 20, 2015

DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation

arXiv:1511.06645v2 · 1021 citations · Has Code
Originality: Highly original
AI Analysis

This work addresses accurate estimation of articulated human poses in complex, real-world images containing multiple people, a capability that matters for applications such as surveillance and human-computer interaction. Its contribution is a novel joint formulation rather than an incremental improvement.

The paper tackles multi-person pose estimation by jointly solving detection and pose estimation, outperforming previous methods that handle these tasks sequentially. It achieves state-of-the-art results on four datasets, with models and code made publicly available.

This paper considers the task of articulated human pose estimation of multiple people in real world images. We propose an approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity to each other. This joint formulation is in contrast to previous strategies that address the problem by first detecting people and subsequently estimating their body pose. We propose a partitioning and labeling formulation of a set of body-part hypotheses generated with CNN-based part detectors. Our formulation, an instance of an integer linear program, implicitly performs non-maximum suppression on the set of part candidates and groups them to form configurations of body parts respecting geometric and appearance constraints. Experiments on four different datasets demonstrate state-of-the-art results for both single person and multi person pose estimation. Models and code are available at http://pose.mpi-inf.mpg.de.
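To make the "partition and label" idea concrete, here is a minimal toy sketch. It is not the authors' implementation. It assumes a log-odds unary cost per labeled candidate and two made-up pairwise costs (the 10 px and 20 px constants are invented for illustration), and it replaces the integer linear program with brute-force enumeration, which is only feasible for a handful of candidates. All function names and the toy detections are hypothetical.

```python
import itertools
import math

CLASSES = ("head", "neck")  # toy part classes, chosen for illustration


def set_partitions(items):
    """Yield every partition of `items` as a list of clusters (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition  # `first` starts its own cluster
        for i in range(len(partition)):  # or joins an existing cluster
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def unary_cost(p):
    """Assumed log-odds cost: negative when the detector is confident."""
    return math.log((1.0 - p) / p)


def pairwise_cost(pos_a, cls_a, pos_b, cls_b):
    """Hypothetical cost of putting two labeled candidates in one person."""
    dist = math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])
    if cls_a == cls_b:
        return dist - 10.0  # nearby duplicates of one part attract (acts like NMS)
    return abs(dist - 20.0) - 10.0  # head and neck assumed about 20 px apart


def solve(candidates):
    """Jointly pick labels (None = suppressed) and a partition into persons."""
    n = len(candidates)
    best = (float("inf"), None, None)
    for labels in itertools.product((None,) + CLASSES, repeat=n):
        active = [i for i in range(n) if labels[i] is not None]
        unary = sum(unary_cost(candidates[i][1][labels[i]]) for i in active)
        for partition in set_partitions(active):
            cost = unary
            for cluster in partition:
                for a, b in itertools.combinations(cluster, 2):
                    cost += pairwise_cost(candidates[a][0], labels[a],
                                          candidates[b][0], labels[b])
            if cost < best[0]:
                best = (cost, labels, partition)
    return best


# Toy detections: (position, per-class probability). Two people plus clutter.
candidates = [
    ((0, 0), {"head": 0.9, "neck": 0.05}),
    ((2, 0), {"head": 0.9, "neck": 0.05}),    # duplicate head detection
    ((0, 20), {"head": 0.05, "neck": 0.9}),
    ((200, 0), {"head": 0.9, "neck": 0.05}),
    ((200, 20), {"head": 0.05, "neck": 0.9}),
    ((100, 100), {"head": 0.1, "neck": 0.1}),  # low-confidence clutter
]
cost, labels, persons = solve(candidates)
print(labels, persons)
```

On this toy input the minimum-cost solution suppresses the clutter candidate, merges the duplicate head into the first person, and yields two clusters, so the number of people falls out of the optimization rather than being fixed in advance. A real system would hand the same objective to an ILP solver instead of enumerating.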

Code Implementations (4 repos)
