CV | Apr 7, 2024

Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer

arXiv:2404.04819v1 | 30 citations | h-index: 23 | Has Code | CVPR
Originality: Highly original
AI Analysis

This work addresses the challenge of accurately modeling physical interactions in 3D scene understanding. The contribution is incremental in that it builds on existing reconstruction methods by incorporating contact cues.

The paper tackles the problem of joint 3D human and object reconstruction from a single image by leveraging human-object contact information, achieving state-of-the-art performance in both contact estimation and reconstruction tasks.

Human-object contact serves as a strong cue to understand how humans physically interact with objects. Nevertheless, the use of human-object contact information for the joint reconstruction of 3D human and object from a single image has not been widely explored. In this work, we present a novel joint 3D human-object reconstruction method (CONTHO) that effectively exploits contact information between humans and objects. There are two core designs in our system: 1) 3D-guided contact estimation and 2) contact-based 3D human and object refinement. First, for accurate human-object contact estimation, CONTHO initially reconstructs 3D humans and objects and utilizes them as explicit 3D guidance for contact estimation. Second, to refine the initial reconstructions of 3D human and object, we propose a novel contact-based refinement Transformer that effectively aggregates human features and object features based on the estimated human-object contact. The proposed contact-based refinement prevents the learning of erroneous correlations between human and object, which enables accurate 3D reconstruction. As a result, CONTHO achieves state-of-the-art performance in both human-object contact estimation and joint reconstruction of 3D human and object. The code is publicly available at https://github.com/dqj5182/CONTHO_RELEASE.
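
To make the idea of aggregating human and object features "based on the estimated human-object contact" more concrete, here is a minimal sketch of one plausible reading: cross-attention in which human tokens may only attend to object tokens predicted to be in contact. This is not taken from the CONTHO code. The function name `contact_masked_cross_attention`, the 0.5 threshold, the residual connection, and the toy features are all assumptions made for illustration, and the released implementation may differ substantially.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contact_masked_cross_attention(queries, keys, key_contact, threshold=0.5):
    """Hypothetical helper: each query attends only to key tokens whose
    estimated contact probability is at least `threshold`.

    The attended feature is added to the query as a residual. If no key
    token is in contact, the queries are returned unchanged.
    """
    kept = [k for k, p in zip(keys, key_contact) if p >= threshold]
    if not kept:
        return [list(q) for q in queries]

    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)  # scaled dot-product attention
    refined = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in kept])
        attended = [sum(w * k[i] for w, k in zip(weights, kept)) for i in range(dim)]
        refined.append([qi + ai for qi, ai in zip(q, attended)])
    return refined


# Toy example (made-up numbers): two human tokens, two object tokens.
human_feats = [[1.0, 0.0], [0.0, 1.0]]
object_feats = [[2.0, 0.0], [0.0, 5.0]]
object_contact = [0.9, 0.1]  # only the first object token is "in contact"

refined = contact_masked_cross_attention(human_feats, object_feats, object_contact)
print(refined)  # [[3.0, 0.0], [2.0, 1.0]]
```

In this toy case the second object token is excluded by the contact mask, so its large feature value never reaches the human tokens. That is the intuition behind preventing erroneous human-object correlations: features from non-contacting regions are kept out of the exchange.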

Code Implementations: 1 repo
