CV MM | Oct 2, 2020

MM-Hand: 3D-Aware Multi-Modal Guided Hand Generative Network for 3D Hand Pose Synthesis

Zhenyu Wu, Duc Hoang, Shih-Yao Lin, Yusheng Xie, Liangjian Chen, Yen-Yu Lin, Zhangyang Wang, Wei Fan

arXiv:2010.01158v17.913 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the high cost of annotating data for 3D hand pose estimation and offers a data augmentation solution for computer vision researchers and practitioners.

The paper tackles the expense of 3D hand pose annotation with MM-Hand, a generative network that synthesizes realistic and diverse hand images guided by 3D pose. According to the authors, the generated images outperform existing options, and augmenting training data with them improves state-of-the-art 3D hand pose estimators on two benchmark datasets.

Estimating the 3D hand pose from a monocular RGB image is important but challenging. A solution is training on large-scale RGB hand images with accurate 3D hand keypoint annotations. However, it is too expensive in practice. Instead, we have developed a learning-based approach to synthesize realistic, diverse, and 3D pose-preserving hand images under the guidance of 3D pose information. We propose a 3D-aware multi-modal guided hand generative network (MM-Hand), together with a novel geometry-based curriculum learning strategy. Our extensive experimental results demonstrate that the 3D-annotated images generated by MM-Hand qualitatively and quantitatively outperform existing options. Moreover, the augmented data can consistently improve the quantitative performance of the state-of-the-art 3D hand pose estimators on two benchmark datasets. The code will be available at https://github.com/ScottHoang/mm-hand.
