CV · AI · May 23, 2024

Synergistic Global-space Camera and Human Reconstruction from Videos

arXiv:2405.14855v1 · 15 citations · h-index: 16 · CVPR
Originality: Highly original
AI Analysis

This work addresses the challenge of integrating camera and human reconstruction for applications in robotics and AR/VR, representing a novel synergistic approach rather than an incremental improvement.

The paper tackles the joint reconstruction of camera trajectories, human meshes, and dense scene point clouds from monocular videos, tasks that were previously addressed independently, and achieves consistent reconstructions in a common world frame.

Remarkable strides have been made in reconstructing static scenes or human bodies from monocular videos. Yet, the two problems have largely been approached independently, without much synergy. Most visual SLAM methods can only reconstruct camera trajectories and scene structures up to scale, while most HMR methods reconstruct human meshes in metric scale but fall short in reasoning with cameras and scenes. This work introduces Synergistic Camera and Human Reconstruction (SynCHMR) to marry the best of both worlds. Specifically, we design Human-aware Metric SLAM to reconstruct metric-scale camera poses and scene point clouds using camera-frame HMR as a strong prior, addressing depth, scale, and dynamic ambiguities. Conditioning on the dense scene recovered, we further learn a Scene-aware SMPL Denoiser to enhance world-frame HMR by incorporating spatio-temporal coherency and dynamic scene constraints. Together, they lead to consistent reconstructions of camera trajectories, human meshes, and dense scene point clouds in a common world frame. Project page: https://paulchhuang.github.io/synchmr
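The scale ambiguity mentioned in the abstract can be illustrated with a toy example. The sketch below is not SynCHMR's Human-aware Metric SLAM. It only shows the general idea of using metric human depths to fix the unknown scale of an up-to-scale SLAM reconstruction, here with a simple median-of-ratios estimate. The function names (`estimate_metric_scale`, `rescale_trajectory`) and all numbers are hypothetical, and the example assumes the two depth lists refer to the same 3D points.

```python
from statistics import median


def estimate_metric_scale(slam_depths, human_depths):
    """Estimate the scalar that maps up-to-scale SLAM depths to metres.

    Assumption: slam_depths[i] and human_depths[i] describe the same point,
    the first in arbitrary SLAM units and the second in metres (e.g. taken
    from a camera-frame human mesh). The median of the per-point ratios is
    used because it tolerates a minority of bad correspondences.
    """
    ratios = [h / d for d, h in zip(slam_depths, human_depths) if d > 0 and h > 0]
    if not ratios:
        raise ValueError("need at least one pair of positive depths")
    return median(ratios)


def rescale_trajectory(camera_positions, scale):
    """Scale camera positions, given relative to the first camera, to metres."""
    return [tuple(scale * c for c in position) for position in camera_positions]


# Hypothetical toy data: the last correspondence is a deliberate outlier.
slam_depths = [1.0, 2.0, 4.0, 3.0]
human_depths = [2.5, 5.0, 10.0, 30.0]

scale = estimate_metric_scale(slam_depths, human_depths)
metric_positions = rescale_trajectory([(0.0, 0.0, 0.0), (0.2, 0.0, 0.4)], scale)
print(scale)
```

A single global scalar is the simplest possible model. The abstract indicates that the paper also addresses depth and dynamic ambiguities, which this sketch does not attempt.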

