Rotation-invariant Mixed Graphical Model Network for 2D Hand Pose Estimation
This work addresses 2D hand pose estimation for computer vision applications, offering an incremental improvement over existing methods.
The paper tackles 2D hand pose estimation from monocular RGB images by proposing the Rotation-invariant Mixed Graphical Model Network (R-MGMN), which integrates rotation invariance and a pool of graphical models to generate confidence maps, and it outperforms the state-of-the-art by a noticeable margin on two public datasets.
In this paper, we propose a new architecture named Rotation-invariant Mixed Graphical Model Network (R-MGMN) to solve the problem of 2D hand pose estimation from a monocular RGB image. By integrating a rotation net, the R-MGMN is invariant to rotations of the hand in the image. It also has a pool of graphical models, from which a combination of graphical models can be selected, conditioned on the input image. Belief propagation is performed on each graphical model separately, generating a set of marginal distributions, which are taken as the confidence maps of hand keypoint positions. The final confidence maps are obtained by aggregating these confidence maps. We evaluate the R-MGMN on two public hand pose datasets. Experimental results show that our model outperforms, by a noticeable margin, the state-of-the-art algorithm that is widely used in 2D hand pose estimation.
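To make the inference step concrete, the following is a minimal sketch of running belief propagation on each model in a pool and aggregating the resulting marginals. It is not the paper's implementation. It assumes a chain-structured model over a handful of discrete candidate positions (a simplification chosen for brevity), and the function names `chain_marginals` and `aggregate`, as well as the per-model weights, are hypothetical. In the R-MGMN the potentials and the combination of models would be conditioned on the input image; here they are plain inputs.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def normalize(v: Vector) -> Vector:
    """Scale a non-negative vector so that it sums to 1."""
    total = sum(v)
    return [x / total for x in v]


def chain_marginals(unary: List[Vector], pairwise: List[Matrix]) -> List[Vector]:
    """Exact sum-product belief propagation on a chain (an assumed topology).

    unary[i][a]       : score of keypoint i being at candidate position a
    pairwise[i][a][b] : compatibility of keypoint i at a and keypoint i+1 at b
    Returns one normalized marginal (a 1D "confidence map") per keypoint.
    """
    n, s = len(unary), len(unary[0])
    fwd = [[1.0] * s for _ in range(n)]  # messages passed left to right
    bwd = [[1.0] * s for _ in range(n)]  # messages passed right to left

    for i in range(1, n):
        prev = [unary[i - 1][a] * fwd[i - 1][a] for a in range(s)]
        fwd[i] = normalize(
            [sum(prev[a] * pairwise[i - 1][a][b] for a in range(s)) for b in range(s)]
        )

    for i in range(n - 2, -1, -1):
        nxt = [unary[i + 1][b] * bwd[i + 1][b] for b in range(s)]
        bwd[i] = normalize(
            [sum(pairwise[i][a][b] * nxt[b] for b in range(s)) for a in range(s)]
        )

    return [
        normalize([unary[i][a] * fwd[i][a] * bwd[i][a] for a in range(s)])
        for i in range(n)
    ]


def aggregate(marginal_sets: List[List[Vector]], weights: Vector) -> List[Vector]:
    """Weighted average of the marginals from each model in the pool.

    The weights are hypothetical stand-ins for an image-conditioned selection.
    """
    w = normalize(weights)
    n, s = len(marginal_sets[0]), len(marginal_sets[0][0])
    return [
        [sum(w[m] * marginal_sets[m][i][a] for m in range(len(w))) for a in range(s)]
        for i in range(n)
    ]


# Two keypoints, two candidate positions, and a pool of two models.
unary = [[0.9, 0.1], [0.5, 0.5]]
pool = [
    [[[0.8, 0.2], [0.2, 0.8]]],  # model 0: neighbors prefer the same position
    [[[0.5, 0.5], [0.5, 0.5]]],  # model 1: no pairwise preference
]
per_model = [chain_marginals(unary, pairwise) for pairwise in pool]
final_maps = aggregate(per_model, weights=[3.0, 1.0])
```

Because the second keypoint has an uninformative unary score, its final marginal is shaped entirely by the pairwise terms and the mixing weights, which illustrates how the selected combination of graphical models affects the output confidence maps.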