CV | LG | Nov 25, 2021

Rotation Equivariant 3D Hand Mesh Generation from a Single RGB Image

arXiv:2111.13023v1
Originality: Highly original
AI Analysis

This paper addresses the challenge of robust 3D hand reconstruction for applications such as VR/AR and robotics, offering a novel approach that improves mesh quality while requiring less training data.

The paper tackles the problem of generating 3D hand meshes from single RGB images by developing a rotation equivariant model that ensures meshes rotate consistently with input image rotations, reducing deformations and data requirements. It outperforms state-of-the-art methods on a real-world dataset, accurately capturing shape and pose under rotation.

We develop a rotation equivariant model for generating 3D hand meshes from 2D RGB images. This guarantees that, as the input image of a hand is rotated, the generated mesh undergoes a corresponding rotation. Furthermore, this removes undesirable deformations in the meshes often generated by methods without rotation equivariance. By building a rotation equivariant model, through considering symmetries in the problem, we reduce the need for training on very large datasets to achieve good mesh reconstruction. The encoder takes images defined on $\mathbb{Z}^{2}$ and maps these to latent functions defined on the group $C_{8}$. We introduce a novel vector mapping function to map the function defined on $C_{8}$ to a latent point cloud space defined on the group $\mathrm{SO}(2)$. Further, we introduce a 3D projection function that learns a 3D function from the $\mathrm{SO}(2)$ latent space. Finally, we use an $\mathrm{SO}(3)$ equivariant decoder to ensure rotation equivariance. Our rotation equivariant model outperforms state-of-the-art methods on a real-world dataset, and we demonstrate that it accurately captures the shape and pose in the generated meshes under rotation of the input hand.
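The equivariance property described in the abstract, that rotating the input produces a correspondingly rotated output, can be checked numerically on a toy example. The sketch below is not the paper's model: it uses a hypothetical stand-in function, `centroid`, that maps an image to a 2D point (its intensity-weighted centre of mass), and it restricts itself to 90-degree rotations so that `np.rot90` is exact and no interpolation is needed. The names `centroid`, `R90`, and `is_equivariant` are assumptions introduced only for illustration.

```python
import numpy as np


def centroid(img):
    """Toy 'model': map a square image to its intensity-weighted centroid.

    Coordinates are measured from the image centre, with x pointing right
    and y pointing up, so image rotations act as ordinary 2D rotations.
    """
    n = img.shape[0]
    c = (n - 1) / 2.0
    rows, cols = np.mgrid[0:n, 0:n]
    x = cols - c   # column index -> x (right is positive)
    y = c - rows   # row index -> y (up is positive)
    w = img.sum()
    return np.array([(img * x).sum() / w, (img * y).sum() / w])


# Matrix for a 90-degree counterclockwise rotation: (x, y) -> (-y, x).
R90 = np.array([[0.0, -1.0],
                [1.0, 0.0]])

rng = np.random.default_rng(0)
img = rng.random((8, 8))  # random positive "image" (assumed test input)

# Equivariance check: f(rotate(image)) should equal rotate(f(image)).
lhs = centroid(np.rot90(img))   # rotate the input, then apply f
rhs = R90 @ centroid(img)       # apply f, then rotate the output
is_equivariant = np.allclose(lhs, rhs)
print(is_equivariant)
```

The same pattern, comparing "rotate then predict" against "predict then rotate", is one way a rotation equivariance claim could be tested for a mesh generator, with the 2D point replaced by mesh vertices and the rotation matrix by the corresponding 3D rotation.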
