CV · Nov 12, 2024

CameraHMR: Aligning People with Perspective

arXiv:2411.08128v153 citationsh-index: 53DV
Originality: Incremental advance
AI Analysis

This work addresses 3D human pose and shape estimation from monocular images. It is an incremental advance that improves accuracy by raising the quality of the training data.

The paper improves the accuracy of the pseudo ground truth used to train monocular 3D human pose and shape estimators in two ways: estimating camera intrinsics and detecting dense surface keypoints. The resulting model, CameraHMR, achieves state-of-the-art accuracy.

We address the challenge of accurate 3D human pose and shape estimation from monocular images. The key to accuracy and robustness lies in high-quality training data. Existing training datasets containing real images with pseudo ground truth (pGT) use SMPLify to fit SMPL to sparse 2D joint locations, assuming a simplified camera with default intrinsics. We make two contributions that improve pGT accuracy. First, to estimate camera intrinsics, we develop a field-of-view prediction model (HumanFoV) trained on a dataset of images containing people. We use the estimated intrinsics to enhance the 4D-Humans dataset by incorporating a full perspective camera model during SMPLify fitting. Second, 2D joints provide limited constraints on 3D body shape, resulting in average-looking bodies. To address this, we use the BEDLAM dataset to train a dense surface keypoint detector. We apply this detector to the 4D-Humans dataset and modify SMPLify to fit the detected keypoints, resulting in significantly more realistic body shapes. Finally, we upgrade the HMR2.0 architecture to include the estimated camera parameters. We iterate model training and SMPLify fitting initialized with the previously trained model. This leads to more accurate pGT and a new model, CameraHMR, with state-of-the-art accuracy. Code and pGT are available for research purposes.
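To illustrate why estimated intrinsics matter, here is a minimal sketch of the standard pinhole relationship between field of view and focal length, and of how the same 3D point projects under an estimated focal length versus a fixed default. This is not the CameraHMR or HumanFoV code. The function names, the 60 degree field of view, the image size, the sample point, and the default focal length of 5000 pixels are all illustrative assumptions.

```python
import math


def focal_from_fov(fov_deg, size_px):
    """Focal length in pixels for a field of view that spans size_px pixels.

    Standard pinhole relation: f = (size / 2) / tan(fov / 2).
    """
    return (size_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def project(point, focal, cx, cy):
    """Perspective projection of a camera-space point (x, y, z), z > 0, to pixels."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


# Hypothetical inputs: a 1920x1080 image and a predicted vertical FoV of 60 degrees.
width, height = 1920, 1080
vfov_deg = 60.0

f_est = focal_from_fov(vfov_deg, height)  # focal length implied by the predicted FoV
f_default = 5000.0                        # illustrative fixed default, not from the paper

# Principal point assumed at the image centre.
cx, cy = width / 2.0, height / 2.0

# A hypothetical body vertex 3 m in front of the camera.
point = (0.5, 0.2, 3.0)

uv_est = project(point, f_est, cx, cy)
uv_default = project(point, f_default, cx, cy)

print(f"estimated focal: {f_est:.1f} px, projection: {uv_est}")
print(f"default focal:   {f_default:.1f} px, projection: {uv_default}")
```

With a wrong focal length, the same 3D point lands at a different pixel, so a fitting procedure that minimises 2D reprojection error has to distort pose, shape, or depth to compensate. Supplying a per-image focal length removes that source of error from the fit.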

Code Implementations: 2 repos
