CV · Apr 13, 2018

BodyNet: Volumetric Inference of 3D Human Body Shapes

arXiv:1804.04875v3 · 455 citations
Originality: Incremental advance
AI Analysis

This paper addresses human body shape estimation for applications such as video editing and animation. The advance is incremental: it builds on prior parametric-model work by introducing a new representation.

The paper predicts 3D human body shape from a single image, a task made difficult by variation in bodies, clothing, and viewpoint. It reports state-of-the-art results on the SURREAL and Unite the People datasets, outperforming recent approaches.

Human shape estimation is an important task for video editing, animation and the fashion industry. Predicting 3D human body shape from natural images, however, is highly challenging due to factors such as variation in human bodies, clothing and viewpoint. Prior methods addressing this problem typically attempt to fit parametric body models with certain priors on pose and shape. In this work we argue for an alternative representation and propose BodyNet, a neural network for direct inference of volumetric body shape from a single image. BodyNet is an end-to-end trainable network that benefits from (i) a volumetric 3D loss, (ii) a multi-view re-projection loss, and (iii) intermediate supervision of 2D pose, 2D body part segmentation, and 3D pose. Each of them results in performance improvement as demonstrated by our experiments. To evaluate the method, we fit the SMPL model to our network output and show state-of-the-art results on the SURREAL and Unite the People datasets, outperforming recent approaches. Besides achieving state-of-the-art performance, our method also enables volumetric body-part segmentation.
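
To make the first two loss terms in the abstract concrete, here is a minimal NumPy sketch of a volumetric 3D loss combined with a multi-view re-projection loss. The abstract does not give the formulas, so several choices here are assumptions made for illustration: binary cross-entropy as the per-voxel and per-pixel loss, silhouettes obtained by taking the maximum occupancy along a grid axis (an orthographic projection), two views (axes 0 and 2), and the weights `w_vol` and `w_proj`. All function names are hypothetical and are not taken from the BodyNet code.

```python
import numpy as np


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 targets."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))


def volumetric_loss(pred_vox, gt_vox):
    """Per-voxel loss on the predicted occupancy grid (assumed: BCE)."""
    return binary_cross_entropy(pred_vox, gt_vox)


def silhouette(vox, axis):
    """Orthographic projection: a pixel is occupied if any voxel on its ray is."""
    return vox.max(axis=axis)


def reprojection_loss(pred_vox, gt_vox, axes=(0, 2)):
    """Sum of 2D silhouette losses over the chosen views (assumed: two axes)."""
    return sum(
        binary_cross_entropy(silhouette(pred_vox, a), silhouette(gt_vox, a))
        for a in axes
    )


def total_loss(pred_vox, gt_vox, w_vol=1.0, w_proj=0.1):
    """Weighted combination; the weights are illustrative, not from the paper."""
    return (w_vol * volumetric_loss(pred_vox, gt_vox)
            + w_proj * reprojection_loss(pred_vox, gt_vox))


# Toy example: an 8x8x8 grid containing a solid box as the "body".
gt = np.zeros((8, 8, 8))
gt[2:6, 2:6, 3:5] = 1.0
uninformed = np.full_like(gt, 0.5)  # a prediction that is 0.5 everywhere

print(total_loss(gt, gt))          # near zero for a perfect prediction
print(total_loss(uninformed, gt))  # larger for the uninformed prediction
```

The re-projection term penalizes errors on the projected outline of the shape, which a plain per-voxel average weights only lightly because boundary voxels are a small fraction of the grid. In a trainable network the same computation would be written in an autodiff framework so that gradients flow through the max-projection.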

Code Implementations: 2 repos
