Skeleton-aware multi-scale heatmap regression for 2D hand pose estimation
This work addresses 2D hand pose estimation for computer vision applications. Its contribution is incremental: it builds on existing methods, adding a skeleton-based constraint and a new dataset.
The paper tackles 2D hand pose estimation from RGB images. It proposes a framework that combines hand skeleton detection with multi-scale heatmap regression to handle varying hand sizes. The authors report state-of-the-art performance on two datasets, with improved pose recovery in cluttered images and complex poses.
Existing RGB-based 2D hand pose estimation methods learn the joint locations from a single resolution, which is not suitable for hands of different sizes. To tackle this problem, we propose a new deep learning-based framework that consists of two main modules. The first module uses a segmentation-based approach to detect the hand skeleton and localize the hand bounding box. The second module regresses the 2D joint locations through a multi-scale heatmap regression approach that exploits the predicted hand skeleton as a constraint to guide the model. Furthermore, we construct a new dataset that is suitable for both hand detection and pose estimation. We validate our method qualitatively and quantitatively on two datasets. Results demonstrate that the proposed method outperforms state-of-the-art methods and can recover the pose even in cluttered images and complex poses.
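To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It illustrates (a) deriving a bounding box from a binary skeleton mask, as a segmentation-based detector might, and (b) building and decoding Gaussian heatmap targets at several resolutions. The function names, the 256-pixel input size, the scales (32, 64, 128), and the Gaussian width are all assumptions chosen for illustration; the abstract does not specify them. How the predicted skeleton constrains the regression is also not described in the abstract, so that step is left out.

```python
import numpy as np


def bbox_from_mask(mask):
    """Tight box (x_min, y_min, x_max, y_max) around the nonzero pixels of a mask.

    Hypothetical helper: stands in for localizing the hand from a skeleton mask.
    """
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def gaussian_heatmap(center_xy, size, sigma):
    """Return a (size, size) map with a Gaussian peak at center_xy (x, y)."""
    xs = np.arange(size, dtype=np.float64)   # column (x) coordinates
    ys = xs[:, None]                         # row (y) coordinates
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def multiscale_targets(joints_xy, input_size=256, scales=(32, 64, 128), sigma=2.0):
    """Build one (J, s, s) heatmap stack per scale s.

    joints_xy is a (J, 2) array of (x, y) positions in input-image pixels.
    The scale values are assumed for this sketch, not taken from the paper.
    """
    targets = {}
    for s in scales:
        ratio = s / input_size               # map image pixels to heatmap cells
        targets[s] = np.stack(
            [gaussian_heatmap(j * ratio, s, sigma) for j in joints_xy]
        )
    return targets


def decode_joints(heatmaps, input_size):
    """Take the argmax of each heatmap and map it back to image pixels."""
    num_joints, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(num_joints, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs, ys], axis=1) * (input_size / w)


# Usage: two hypothetical joints in a 256x256 crop.
joints = np.array([[64.0, 128.0], [200.0, 40.0]])
targets = multiscale_targets(joints)
recovered = decode_joints(targets[64], input_size=256)
```

Coarser heatmaps cover more context per cell while finer ones localize more precisely, which is the general motivation for supervising at several resolutions when hand size varies. In a trained model the heatmaps would come from network outputs rather than from ground-truth joints as in this sketch.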