CV · May 3, 2019

Anti-Confusing: Region-Aware Network for Human Pose Estimation

arXiv:1905.00996v2 · 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses accurate human pose estimation in complex real-world scenes, a core problem for computer vision applications. It is an incremental advance with novel components.

The paper tackles the problem of human pose estimation under challenging conditions like occlusion and symmetric appearance by proposing a Region-Aware Network (RANet), which achieves state-of-the-art results on benchmarks such as MPII and LSP, with significant improvements on easily-confusable joints.

In this work, we propose a novel framework named Region-Aware Network (RANet), which learns to avoid confusion under heavy occlusion, nearby persons, and symmetric appearance in human pose estimation. Specifically, the proposed method addresses three key aspects: data augmentation, feature learning, and prediction fusion. First, we propose Parsing-based Data Augmentation (PDA) to generate abundant data that synthesizes confusing textures. Second, we propose a Feature Pyramid Stem (FPS) to learn stronger low-level features in the lower stage, and we incorporate an Effective Region Extraction (ERE) module to extract better target-specific features. Third, we introduce Cascade Voting Fusion (CVF) to explicitly exclude inferior predictions and fuse the remaining effective predictions for the final pose estimation. Extensive experimental results on two popular benchmarks, MPII and LSP, demonstrate the effectiveness of our method against state-of-the-art competitors. The improvement is especially significant on easily confusable joints.
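The abstract does not spell out how Cascade Voting Fusion decides which predictions are inferior. The sketch below is only an illustration of the general "exclude, then fuse" idea for a single joint, not the paper's algorithm. The function name `vote_and_fuse`, the `radius` threshold, the median-based vote, and the confidence-weighted average are all assumptions made for this example, and it shows a single voting stage rather than a cascade.

```python
from statistics import median


def vote_and_fuse(candidates, radius=10.0):
    """Fuse candidate (x, y, score) predictions for one joint.

    Hypothetical rule (an assumption, not taken from the paper):
    candidates farther than `radius` pixels from the coordinate-wise
    median are treated as inferior and dropped; the remaining ones are
    averaged with their confidence scores as weights.
    Returns (fused_x, fused_y, number_of_candidates_kept).
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    # Voting step: the coordinate-wise median acts as the consensus location.
    mx = median(x for x, _, _ in candidates)
    my = median(y for _, y, _ in candidates)

    # Exclusion step: keep only candidates close to the consensus.
    kept = [
        (x, y, s)
        for x, y, s in candidates
        if ((x - mx) ** 2 + (y - my) ** 2) ** 0.5 <= radius
    ]

    # The median may not lie near any candidate; fall back to the
    # single most confident prediction in that case.
    if not kept:
        kept = [max(candidates, key=lambda c: c[2])]

    # Fusion step: confidence-weighted mean (uniform if all scores are 0).
    total = sum(s for _, _, s in kept)
    if total <= 0:
        weights = [1.0 / len(kept)] * len(kept)
    else:
        weights = [s / total for _, _, s in kept]

    fused_x = sum(w * x for w, (x, _, _) in zip(weights, kept))
    fused_y = sum(w * y for w, (_, y, _) in zip(weights, kept))
    return fused_x, fused_y, len(kept)
```

For example, with candidates `(100, 200, 0.9)`, `(102, 198, 0.8)`, and `(160, 205, 0.7)`, the third lies far from the median location and is dropped, so only the first two contribute to the fused coordinate. A location that a confusing neighbor or a mirrored limb pulled away from the consensus would be excluded in the same way.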

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
