CV · Aug 8, 2023

Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video

Weichao Zhao, Hezhen Hu, Wengang Zhou, Li Li, Houqiang Li

arXiv:2308.04074v33.95 citations · h-index: 68

Originality: Incremental advance

AI Analysis

This work addresses the problem of accurate 3D hand reconstruction for applications in human-computer interaction and virtual reality, representing an incremental improvement over previous methods.

The paper tackles the problem of reconstructing interacting hands from monocular RGB video, which is challenging due to occlusions and similar textures, by exploiting spatial-temporal context to achieve new state-of-the-art performance on public benchmarks.

Reconstructing interacting hands from monocular RGB data is a challenging task, as it involves many interfering factors, e.g. self- and mutual occlusion and similar textures. Previous works only leverage information from a single RGB image without modeling their physically plausible relation, which leads to inferior reconstruction results. In this work, we are dedicated to explicitly exploiting spatial-temporal information to achieve better interacting hand reconstruction. On one hand, we leverage temporal context to complement insufficient information provided by the single frame, and design a novel temporal framework with a temporal constraint for interacting hand motion smoothness. On the other hand, we further propose an interpenetration detection module to produce kinetically plausible interacting hands without physical collisions. Extensive experiments are performed to validate the effectiveness of our proposed framework, which achieves new state-of-the-art performance on public benchmarks.
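The abstract does not spell out how the temporal constraint or the interpenetration detection module is formulated, so the following is only a minimal sketch of the two general ideas, not the paper's method. It assumes hand joints are given as 3D points per frame, uses a second-difference (acceleration) penalty as one common way to encourage motion smoothness, and approximates each joint by a sphere of a hypothetical fixed radius to flag collisions between the two hands. The function names and the `radius` parameter are illustrative assumptions.

```python
from math import dist


def smoothness_penalty(frames):
    """Mean squared second difference (acceleration) of joint coordinates.

    frames: list of frames; each frame is a list of (x, y, z) joint tuples.
    Assumption: a generic smoothness term, not the paper's constraint.
    """
    total, count = 0.0, 0
    # Slide a window of three consecutive frames over the sequence.
    for prev, cur, nxt in zip(frames, frames[1:], frames[2:]):
        for p, c, n in zip(prev, cur, nxt):
            for a, b, d in zip(p, c, n):
                total += (a - 2.0 * b + d) ** 2
                count += 1
    # Fewer than three frames gives no acceleration estimate.
    return total / count if count else 0.0


def penetration_depth(left_joints, right_joints, radius=0.01):
    """Total overlap between equal-radius spheres centered on the joints.

    Assumption: a crude sphere proxy for the hand surface; `radius`
    (in the same units as the joints) is a hypothetical value.
    """
    depth = 0.0
    for l in left_joints:
        for r in right_joints:
            # Two spheres overlap when their centers are closer than 2 * radius.
            depth += max(0.0, 2.0 * radius - dist(l, r))
    return depth
```

In a training setup, both values could be added to the reconstruction loss as weighted terms: the first is zero for constant-velocity motion and grows with frame-to-frame jitter, and the second is zero whenever no pair of left and right joint spheres overlaps.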
