CV | Jun 23, 2016

Robust 3D Hand Pose Estimation in Single Depth Images: from Single-View CNN to Multi-View CNNs

arXiv:1606.07253v3 | 296 citations
Originality: Incremental advance
AI Analysis

The method improves hand pose estimation accuracy for human-computer interaction applications, but it is an incremental advance over existing discriminative methods.

The paper tackles 3D hand pose estimation from a single depth image. It projects the image onto three orthogonal planes, regresses a 2D heat-map of joint positions on each plane, and fuses the heat-maps with learned pose priors to produce the final 3D estimate. It outperforms state-of-the-art methods by a large margin on a challenging dataset and generalizes well in a cross-dataset experiment.

Articulated hand pose estimation plays an important role in human-computer interaction. Despite recent progress, the accuracy of existing methods is still not satisfactory, partially due to the difficulty of the embedded high-dimensional and non-linear regression problem. Unlike existing discriminative methods that regress the hand pose from a single depth image, we propose to first project the query depth image onto three orthogonal planes and use these multi-view projections to regress 2D heat-maps, which estimate the joint positions on each plane. These multi-view heat-maps are then fused with learned pose priors to produce the final 3D hand pose estimate. Experiments show that the proposed method outperforms the state of the art by a large margin on a challenging dataset. Moreover, a cross-dataset experiment demonstrates the good generalization ability of the proposed method.
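
The projection and fusion steps described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names `project_to_planes` and `fuse_heatmaps`, the axis-aligned bounding box, the grid resolution, and the product-of-heat-maps fusion are all assumptions made for illustration. In particular, the learned pose priors mentioned in the abstract are left out, and the heat-maps here are supplied by hand rather than regressed by a CNN.

```python
import numpy as np

def project_to_planes(points, res=32):
    """Project 3D points (N, 3) onto the x-y, y-z and z-x planes.

    Returns a dict of res x res depth maps. Assumption: an axis-aligned
    bounding box is used to normalise the points into the unit cube.
    """
    lo = points.min(axis=0)
    hi = points.max(axis=0)
    span = max((hi - lo).max(), 1e-9)  # one scale for all axes keeps the aspect ratio
    unit = (points - lo) / span        # coordinates in [0, 1]
    idx = np.minimum((unit * res).astype(int), res - 1)

    # For each view: (column axis, row axis, depth axis).
    axes = {"xy": (0, 1, 2), "yz": (1, 2, 0), "zx": (2, 0, 1)}
    views = {}
    for name, (u, v, d) in axes.items():
        img = np.ones((res, res))  # 1.0 marks background (farthest depth)
        # Keep the nearest depth value that falls into each pixel.
        np.minimum.at(img, (idx[:, v], idx[:, u]), unit[:, d])
        views[name] = img
    return views

def fuse_heatmaps(h_xy, h_yz, h_zx):
    """Pick the voxel (x, y, z) maximising the product of three heat-maps.

    Assumption: the heat-maps are indexed like the views above, i.e.
    h_xy[y, x], h_yz[z, y] and h_zx[x, z]. No pose prior is applied.
    """
    score = (h_xy.T[:, :, None]      # [x, y, 1]
             * h_yz.T[None, :, :]    # [1, y, z]
             * h_zx[:, None, :])     # [x, 1, z]
    return tuple(int(i) for i in np.unravel_index(np.argmax(score), score.shape))
```

In this sketch a joint location is taken where all three views agree. The method described in the abstract additionally constrains the fused estimate with learned pose priors, which the sketch does not model.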

