CV | Aug 16, 2021

Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation

arXiv:2108.07181v2 | 147 citations | Has Code
AI Analysis

This paper addresses the challenge of accurately estimating 3D poses in difficult scenarios for computer vision applications. It represents a strong, specific gain rather than a broad breakthrough.

The paper tackles the problem of hard 3D pose estimation with depth ambiguity and occlusion by proposing a skeletal GNN solution, achieving a 10.3% average accuracy improvement on Human3.6M and state-of-the-art performance on action recognition.

Various deep learning techniques have been proposed to solve the single-view 2D-to-3D pose estimation problem. While the average prediction accuracy has improved significantly over the years, the performance on hard poses with depth ambiguity, self-occlusion, and complex or rare poses is still far from satisfactory. In this work, we target these hard poses and present a novel skeletal GNN learning solution. Specifically, we propose a hop-aware hierarchical channel-squeezing fusion layer to effectively extract relevant information from neighboring nodes while suppressing undesired noise in GNN learning. In addition, we propose a temporal-aware dynamic graph construction procedure that is robust and effective for 3D pose estimation. Experimental results on the Human3.6M dataset show that our solution achieves a 10.3% average prediction accuracy improvement and greatly improves on hard poses over state-of-the-art techniques. We further apply the proposed technique to the skeleton-based action recognition task and also achieve state-of-the-art performance. Our code is available at https://github.com/ailingzengzzz/Skeletal-GNN.
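
To make the "hop-aware channel-squeezing" idea more concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). It only illustrates one plausible reading of the abstract: features are aggregated separately per hop distance, and farther hops are projected to fewer channels before fusion, so long-range neighbors contribute less capacity. The function names (`hop_masks`, `hop_aware_fusion`), the halving schedule (`squeeze=0.5`), mean aggregation, concatenation as the fusion step, the random weights standing in for learned ones, and the toy 5-joint chain are all assumptions made for illustration.

```python
import numpy as np


def hop_masks(adj, max_hop):
    """Return a list where masks[k-1][i, j] is True iff the shortest
    path between joints i and j has exactly k edges."""
    n = adj.shape[0]
    reach = np.eye(n, dtype=bool)      # joints already seen (distance < k)
    frontier = np.eye(n, dtype=bool)   # joints at distance exactly k-1
    masks = []
    for _ in range(max_hop):
        step = (frontier.astype(int) @ adj.astype(int)) > 0
        frontier = step & ~reach       # keep only newly reached joints
        reach |= frontier
        masks.append(frontier.copy())
    return masks


def hop_aware_fusion(x, adj, out_dim, max_hop=3, squeeze=0.5, rng=None):
    """Illustrative fusion: hop k is projected to out_dim * squeeze**(k-1)
    channels (an assumed schedule), then all parts are concatenated."""
    rng = np.random.default_rng(0) if rng is None else rng
    in_dim = x.shape[1]
    # Self features keep the full width. Random weights stand in for learned ones.
    parts = [x @ rng.standard_normal((in_dim, out_dim))]
    for k, mask in enumerate(hop_masks(adj, max_hop), start=1):
        width = max(1, int(out_dim * squeeze ** (k - 1)))
        deg = np.maximum(mask.sum(axis=1, keepdims=True), 1)
        agg = (mask.astype(float) @ x) / deg   # mean over k-hop neighbors
        parts.append(agg @ rng.standard_normal((in_dim, width)))
    return np.concatenate(parts, axis=1)


# Toy skeleton (hypothetical): a chain of 5 joints, 0-1-2-3-4.
adj = np.zeros((5, 5), dtype=int)
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1

x = np.random.default_rng(1).standard_normal((5, 2))  # 2D joint coordinates
out = hop_aware_fusion(x, adj, out_dim=8)
print(out.shape)  # (5, 22): 8 self + 8 one-hop + 4 two-hop + 2 three-hop
```

In this sketch, the squeezing schedule is the only knob that controls how much long-range context survives. A trained model would learn the projection weights rather than sample them.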

