CV · Jun 15, 2021

Real-time Pose and Shape Reconstruction of Two Interacting Hands With a Single Depth Camera

arXiv:2106.08059v1 · 178 citations
AI Analysis

The method enables marker-less, real-time tracking of two hands for applications such as VR/AR. It addresses a domain-specific problem and offers incremental improvements in handling hand interactions.

The paper tackles real-time pose and shape reconstruction of two interacting hands using a single depth camera, achieving state-of-the-art results in complex scenes like tight grasps and occlusions with real-time performance.

We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands. Our approach is the first two-hand tracking solution that combines an extensive list of favorable properties, namely it is marker-less, uses a single consumer-level depth camera, runs in real time, handles inter- and intra-hand collisions, and automatically adjusts to the user's hand shape. In order to achieve this, we embed a recent parametric hand pose and shape model and a dense correspondence predictor based on a deep neural network into a suitable energy minimization framework. For training the correspondence prediction network, we synthesize a two-hand dataset based on physical simulations that includes both hand pose and shape annotations while at the same time avoiding inter-hand penetrations. To achieve real-time rates, we phrase the model fitting in terms of a nonlinear least-squares problem so that the energy can be optimized based on a highly efficient GPU-based Gauss-Newton optimizer. We show state-of-the-art results in scenes that exceed the complexity level demonstrated by previous work, including tight two-hand grasps, significant inter-hand occlusions, and gesture interaction.
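The abstract's central computational idea, phrasing model fitting as a nonlinear least-squares problem and minimizing it with Gauss-Newton, can be illustrated on a toy problem. The sketch below is not the authors' GPU optimizer or hand model: it assumes a hypothetical planar two-joint "finger" whose fingertip should match one observed point, plus a weak prior term, so that the energy is a sum of squared residuals. All names (`fingertip`, `residuals`, `jacobian`, `gauss_newton`, `w_prior`) are illustrative assumptions.

```python
import numpy as np

# Toy stand-in for a hand model: a planar finger with two joints.
# The segment lengths are assumptions made for this example only.
LENGTHS = (1.0, 0.8)


def fingertip(theta):
    """Forward kinematics: joint angles (2,) -> fingertip position (2,)."""
    l1, l2 = LENGTHS
    a, b = theta[0], theta[0] + theta[1]
    return np.array([l1 * np.cos(a) + l2 * np.cos(b),
                     l1 * np.sin(a) + l2 * np.sin(b)])


def residuals(theta, target, prior, w_prior=0.1):
    """Stacked residual vector: data term followed by a weak pose prior."""
    data = fingertip(theta) - target
    reg = w_prior * (theta - prior)
    return np.concatenate([data, reg])


def jacobian(theta, w_prior=0.1):
    """Analytic Jacobian of residuals() with respect to theta, shape (4, 2)."""
    l1, l2 = LENGTHS
    a, b = theta[0], theta[0] + theta[1]
    j_data = np.array([[-l1 * np.sin(a) - l2 * np.sin(b), -l2 * np.sin(b)],
                       [l1 * np.cos(a) + l2 * np.cos(b), l2 * np.cos(b)]])
    j_reg = w_prior * np.eye(2)
    return np.vstack([j_data, j_reg])


def gauss_newton(theta0, target, prior, iters=20, tol=1e-10):
    """Minimize 0.5 * ||r(theta)||^2 with plain Gauss-Newton steps."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(iters):
        r = residuals(theta, target, prior)
        J = jacobian(theta)
        # Normal equations: (J^T J) step = -J^T r
        step = np.linalg.solve(J.T @ J, -J.T @ r)
        theta = theta + step
        if np.linalg.norm(step) < tol:
            break
    return theta


true_theta = np.array([0.5, 0.7])
target = fingertip(true_theta)  # synthetic "observation"
estimate = gauss_newton([0.3, 0.5], target, prior=true_theta)
```

In this toy setup the prior is centered on the true pose, so the minimum has zero residual and the iteration should recover `true_theta` from the nearby starting guess. A real-time tracker would replace the two residual blocks with many energy terms over dense correspondences and solve the normal equations on the GPU, as the abstract describes.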

