3D-Aware Semantic-Guided Generative Model for Human Synthesis
This work addresses a key challenge in computer graphics, realistic human image synthesis, although the contribution appears incremental because it builds on existing GNeRF methods.
The paper tackles the generation of high-quality images of non-rigid human bodies, a task on which existing Generative Neural Radiance Field (GNeRF) models struggle. It proposes a 3D-aware Semantic-Guided Generative Model that combines a GNeRF with a texture generator, and it reports significant improvements over recent baselines on the DeepFashion dataset.
Generative Neural Radiance Field (GNeRF) models, which extract implicit 3D representations from 2D images, have recently been shown to produce realistic images of rigid or semi-rigid objects, such as human faces or cars. However, they usually struggle to generate high-quality images of non-rigid objects, such as the human body, which is of great interest for many computer graphics applications. This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SGAN) for human image synthesis, which combines a GNeRF with a texture generator. The former learns an implicit 3D representation of the human body and outputs a set of 2D semantic segmentation masks. The latter transforms these semantic masks into a real image, adding a realistic texture to the human appearance. Without requiring additional 3D information, our model can learn 3D human representations with photo-realistic, controllable generation. Our experiments on the DeepFashion dataset show that 3D-SGAN significantly outperforms the most recent baselines. The code is available at https://github.com/zhangqianhui/3DSGAN
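The two-stage data flow described in the abstract (a GNeRF that outputs semantic masks, followed by a texture generator that turns the masks into an image) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the four-class label set, the latent codes `z_shape` and `z_texture`, and the camera parameter `yaw` are all hypothetical, and both stages are replaced by simple deterministic stand-ins so that only the interface between them is shown.

```python
import math
import random

# Hypothetical label set: 0=background, 1=skin, 2=upper clothes, 3=lower clothes.
NUM_CLASSES = 4


def semantic_generator(z_shape, yaw, height=8, width=6):
    """Stand-in for the GNeRF stage: (shape code, camera yaw) -> 2D label mask.

    A real GNeRF would volume-render an implicit 3D field. This toy version
    only draws a body-like column that narrows as the camera turns sideways.
    """
    rng = random.Random(z_shape)
    torso_split = rng.randint(height // 3, 2 * height // 3)
    half_width = max(1, round((width / 2 - 1) * abs(math.cos(yaw))))
    centre = width // 2
    mask = [[0] * width for _ in range(height)]
    for row in range(1, height):
        if row == 1:
            label = 1  # head row
        elif row <= torso_split:
            label = 2  # upper clothes
        else:
            label = 3  # lower clothes
        for col in range(centre - half_width, centre + half_width):
            mask[row][col] = label
    return mask


def texture_generator(mask, z_texture):
    """Stand-in for the texture stage: (label mask, texture code) -> RGB image.

    Each semantic class gets one colour drawn from the texture code;
    the background stays black.
    """
    rng = random.Random(z_texture)
    palette = [(0, 0, 0)] + [
        tuple(rng.randrange(256) for _ in range(3))
        for _ in range(NUM_CLASSES - 1)
    ]
    return [[palette[label] for label in row] for row in mask]


def generate(z_shape, z_texture, yaw):
    """Run both stages: shape and pose fix the mask, texture fills it in."""
    mask = semantic_generator(z_shape, yaw)
    image = texture_generator(mask, z_texture)
    return mask, image


if __name__ == "__main__":
    mask, image = generate(z_shape=1, z_texture=10, yaw=0.0)
    for row in mask:
        print("".join(str(label) for label in row))
```

In this sketch, changing `z_texture` leaves the mask untouched and only recolours it, while changing `yaw` alters the mask itself. That separation is one plausible reading of the "controllable generation" the abstract mentions, under the assumptions stated above.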