MonoNHR: Monocular Neural Human Renderer
MonoNHR addresses robust free-viewpoint rendering of humans from a single monocular image, with applications in computer vision and graphics.
The paper tackles the problem of rendering free-viewpoint images of humans from a single input image, which is challenging because occluded areas carry no information and visible pixels are ambiguous in depth. It introduces MonoNHR, a method that is trained without geometry supervision and outperforms recent methods adapted to the monocular case on the ZJU-MoCap, AIST, and HUMBI datasets.
Existing neural human rendering methods struggle with a single image input due to the lack of information in invisible areas and the depth ambiguity of pixels in visible areas. To address this, we propose Monocular Neural Human Renderer (MonoNHR), a novel approach that robustly renders free-viewpoint images of an arbitrary human given only a single image. MonoNHR is the first method that (i) renders human subjects never seen during training in a monocular setup, and (ii) is trained in a weakly-supervised manner without geometry supervision. First, we propose to disentangle 3D geometry and texture features and to condition the texture inference on the 3D geometry features. Second, we introduce a Mesh Inpainter module that inpaints the occluded parts by exploiting human structural priors such as symmetry. Experiments on the ZJU-MoCap, AIST, and HUMBI datasets show that our approach significantly outperforms recent methods adapted to the monocular case.
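The first design choice above, disentangling geometry and texture features and conditioning texture inference on geometry, can be illustrated with a toy sketch. This is not the MonoNHR implementation: the class name `ToyDisentangledField`, the feature dimensions, the layer sizes, and the random untrained weights are all assumptions made for illustration. It only shows the data flow in which density depends on geometry features alone, while color depends on texture features concatenated with the geometry branch's hidden features.

```python
import numpy as np

rng = np.random.default_rng(0)


def linear(in_dim, out_dim):
    """Random weights for a toy linear layer (hypothetical, untrained)."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ToyDisentangledField:
    """Toy per-point field with separate geometry and texture branches.

    All sizes are illustrative assumptions, not values from the paper.
    """

    def __init__(self, geo_dim=16, tex_dim=16, hidden=32):
        # Geometry branch: geometry features -> hidden -> density.
        self.geo_w, self.geo_b = linear(geo_dim, hidden)
        self.density_w, self.density_b = linear(hidden, 1)
        # Texture branch: [texture features, geometry hidden] -> hidden -> RGB.
        self.tex_w, self.tex_b = linear(tex_dim + hidden, hidden)
        self.rgb_w, self.rgb_b = linear(hidden, 3)

    def __call__(self, geo_feat, tex_feat):
        # Density is predicted from geometry features only.
        h_geo = relu(geo_feat @ self.geo_w + self.geo_b)
        density = relu(h_geo @ self.density_w + self.density_b)
        # Color is conditioned on geometry by concatenating the geometry
        # hidden features with the texture features.
        tex_in = np.concatenate([tex_feat, h_geo], axis=-1)
        h_tex = relu(tex_in @ self.tex_w + self.tex_b)
        rgb = sigmoid(h_tex @ self.rgb_w + self.rgb_b)
        return density, rgb


field = ToyDisentangledField()
geo = rng.standard_normal((5, 16))  # per-point geometry features (made up)
tex = rng.standard_normal((5, 16))  # per-point texture features (made up)
density, rgb = field(geo, tex)
```

In this sketch, changing `tex` leaves `density` unchanged, which is the sense in which the two branches are disentangled; changing `geo` can change both outputs, since the color branch reads the geometry hidden features.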