CV
Dec 20, 2021

BAPose: Bottom-Up Pose Estimation with Disentangled Waterfall Representations

arXiv:2112.10716v1
10 citations
Originality: Incremental advance
AI Analysis

The paper addresses pose estimation in crowded environments, which matters for applications such as surveillance and sports analysis. The contribution appears incremental because it builds on existing bottom-up methods.

The paper tackles multi-person pose estimation in crowded scenes with occlusions by proposing BAPose, a bottom-up approach that achieves state-of-the-art results on COCO and CrowdPose datasets with significant accuracy improvements.

We propose BAPose, a novel bottom-up approach that achieves state-of-the-art results for multi-person pose estimation. Our end-to-end trainable framework leverages a disentangled multi-scale waterfall architecture and incorporates adaptive convolutions to infer keypoints more precisely in crowded scenes with occlusions. The multi-scale representations, obtained by the disentangled waterfall module in BAPose, leverage the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Our results on the challenging COCO and CrowdPose datasets demonstrate that BAPose is an efficient and robust framework for multi-person pose estimation, achieving significant improvements on state-of-the-art accuracy.
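The abstract's claim that a cascade can retain fields-of-view comparable to a spatial pyramid can be checked with simple arithmetic. The sketch below is only an illustration under stated assumptions: the 3x3 kernel, the dilation rates (6, 12, 18, 24), stride 1, and all function names are hypothetical choices, not details confirmed by this page. It compares the per-branch field-of-view of parallel dilated convolutions (pyramid style) with the field-of-view at each stage of a cascade whose intermediate outputs are all tapped (waterfall style).

```python
def dilated_fov(kernel, dilation):
    """Field-of-view (pixels along one axis) of one stride-1 dilated convolution."""
    return dilation * (kernel - 1) + 1


def pyramid_fovs(rates, kernel=3):
    """Parallel branches: each branch sees the input through a single dilated conv."""
    return [dilated_fov(kernel, r) for r in rates]


def waterfall_fovs(rates, kernel=3):
    """Cascade with a tap after every stage: fields-of-view accumulate stage by stage."""
    fovs = []
    fov = 1  # a single input pixel before any filtering
    for r in rates:
        # Each stride-1 stage widens the view by (its own FOV - 1).
        fov += dilated_fov(kernel, r) - 1
        fovs.append(fov)
    return fovs


# Assumed dilation rates, chosen for illustration only.
RATES = (6, 12, 18, 24)

if __name__ == "__main__":
    print("pyramid  :", pyramid_fovs(RATES))    # [13, 25, 37, 49]
    print("waterfall:", waterfall_fovs(RATES))  # [13, 37, 73, 121]
```

Under these assumptions, every tap of the cascade sees at least as wide a region as the corresponding parallel branch, while each stage reuses the filtering done by the stages before it. The actual module in BAPose may differ in rates, kernel sizes, and how branches are combined.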

Code Implementations: 1 repo
