CV | Sep 29, 2020

Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People

arXiv:2009.14162v1 | 19 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of accurate 3D human shape estimation from monocular images, for applications such as virtual reality and animation. The advance is incremental, since it builds on existing volumetric and learning-based frameworks.

The paper aims to improve the accuracy of 3D reconstruction of clothed people from a single image. It introduces a new synthetic dataset (3DVH) and a multi-view consistency loss, and reports significantly better accuracy, completeness, and quality than previous state-of-the-art methods.

We present a novel method to improve the accuracy of the 3D reconstruction of clothed human shape from a single image. Recent work has introduced volumetric, implicit and model-based shape learning frameworks for reconstruction of objects and people from one or more images. However, the accuracy and completeness for reconstruction of clothed people is limited due to the large variation in shape resulting from clothing, hair, body size, pose and camera viewpoint. This paper introduces two advances to overcome this limitation: firstly, a new synthetic dataset of realistic clothed people, 3DVH; and secondly, a novel multiple-view loss function for training of monocular volumetric shape estimation, which is demonstrated to significantly improve generalisation and reconstruction accuracy. The 3DVH dataset of realistic clothed 3D human models rendered with diverse natural backgrounds is demonstrated to allow transfer to reconstruction from real images of people. Comprehensive comparative performance evaluation on both synthetic and real images of people demonstrates that the proposed method significantly outperforms the previous state-of-the-art learning-based single image 3D human shape estimation approaches, achieving significant improvement of reconstruction accuracy, completeness, and quality. An ablation study shows that this is due to both the proposed multiple-view training and the new 3DVH dataset. The code and the dataset can be found at the project website: https://akincaliskan3d.github.io/MV3DH/.
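
The abstract names a multiple-view loss for training monocular volumetric shape estimation but does not give its formula. As a rough illustration only, the sketch below shows one plausible shape such a loss could take: a per-view reconstruction term plus a penalty on disagreement between the volumes predicted from different views of the same person. Everything in it is an assumption rather than the paper's definition: the function names `bce` and `multi_view_loss`, the `weight` parameter, the use of binary cross-entropy and mean squared difference, and the premise that all predicted occupancy volumes have already been resampled into a shared canonical frame.

```python
import numpy as np


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted and true voxel occupancy."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))


def multi_view_loss(preds, target, weight=1.0):
    """Hypothetical multi-view loss (not the paper's exact formulation).

    preds:  array of shape (N, D, H, W), occupancy probabilities predicted
            independently from N views, assumed aligned to one canonical frame.
    target: array of shape (D, H, W), ground-truth occupancy in {0, 1}.
    weight: assumed trade-off between reconstruction and consistency terms.
    """
    preds = np.asarray(preds, dtype=np.float64)
    n_views = preds.shape[0]

    # Reconstruction term: every view's prediction should match ground truth.
    recon = float(np.mean([bce(p, target) for p in preds]))
    if n_views < 2:
        return recon

    # Consistency term: predictions from different views should agree.
    pair_terms = [np.mean((preds[i] - preds[j]) ** 2)
                  for i in range(n_views)
                  for j in range(i + 1, n_views)]
    consistency = float(np.mean(pair_terms))

    return recon + weight * consistency
```

In this sketch, identical predictions across views give a consistency term of zero, so the loss reduces to the ordinary single-view reconstruction loss. At test time only one image would be needed, since the consistency term is a training-time constraint.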

