CV LG Nov 25, 2021

Rotation Equivariant 3D Hand Mesh Generation from a Single RGB Image

Joshua Mitton, Chaitanya Kaul, Roderick Murray-Smith

arXiv:2111.13023v11.4

Originality Highly original

AI Analysis

The paper addresses the challenge of robust 3D hand reconstruction for applications such as VR/AR and robotics, offering a novel approach that aims to improve mesh quality with less training data.

The paper tackles the problem of generating 3D hand meshes from single RGB images by developing a rotation equivariant model that ensures meshes rotate consistently with input image rotations, reducing deformations and data requirements. It outperforms state-of-the-art methods on a real-world dataset, accurately capturing shape and pose under rotation.

We develop a rotation equivariant model for generating 3D hand meshes from 2D RGB images. This guarantees that, as the input image of a hand is rotated, the generated mesh undergoes a corresponding rotation. Furthermore, this removes the undesirable deformations in the meshes that are often generated by methods without rotation equivariance. By considering symmetries in the problem and building a rotation equivariant model, we reduce the need for training on very large datasets to achieve good mesh reconstruction. The encoder takes images defined on $\mathbb{Z}^{2}$ and maps them to latent functions defined on the group $C_{8}$. We introduce a novel vector mapping function to map the function defined on $C_{8}$ to a latent point cloud space defined on the group $\mathrm{SO}(2)$. Further, we introduce a 3D projection function that learns a 3D function from the $\mathrm{SO}(2)$ latent space. Finally, we use an $\mathrm{SO}(3)$ equivariant decoder to ensure rotation equivariance. Our rotation equivariant model outperforms state-of-the-art methods on a real-world dataset, and we demonstrate that it accurately captures the shape and pose in the generated meshes under rotation of the input hand.
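The abstract does not spell out the vector mapping function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical map that sends a scalar function $f$ on $C_{8}$ to the planar vector $\sum_{k} f(k)\,(\cos\theta_k, \sin\theta_k)$ with $\theta_k = 2\pi k/8$, and checks numerically that cyclically shifting $f$ (a rotation by a multiple of 45°) rotates the output vector by the same angle. All names (`vector_map`, `rotate_c8`, `rotate_plane`, `equivariance_error`) are illustrative.

```python
import math

N = 8  # order of the cyclic group C_8 (rotations by multiples of 45 degrees)


def vector_map(f):
    """Hypothetical map from a function on C_8 to a 2D vector.

    Computes sum_k f[k] * (cos(theta_k), sin(theta_k)), theta_k = 2*pi*k/N.
    This is an assumed stand-in, not the mapping defined in the paper.
    """
    x = sum(f[k] * math.cos(2 * math.pi * k / N) for k in range(N))
    y = sum(f[k] * math.sin(2 * math.pi * k / N) for k in range(N))
    return (x, y)


def rotate_c8(f, steps):
    """Act on f by a rotation of steps * 45 degrees: (g.f)[k] = f[k - steps]."""
    return [f[(k - steps) % N] for k in range(N)]


def rotate_plane(v, theta):
    """Rotate a 2D vector counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def equivariance_error(f, steps):
    """Largest coordinate gap between 'rotate then map' and 'map then rotate'."""
    lhs = vector_map(rotate_c8(f, steps))
    rhs = rotate_plane(vector_map(f), 2 * math.pi * steps / N)
    return max(abs(a - b) for a, b in zip(lhs, rhs))


if __name__ == "__main__":
    f = [0.3, -1.2, 2.0, 0.5, 0.0, 1.1, -0.7, 0.9]  # arbitrary test values
    for steps in range(N):
        print(steps, equivariance_error(f, steps))
```

Under these assumptions the reported error should stay at the level of floating-point round-off for every element of $C_{8}$, which is the equivariance property $\phi(g \cdot f) = \rho(g)\,\phi(f)$ in its simplest form.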
