CV · Mar 15, 2024

SemanticHuman-HD: High-Resolution Semantic Disentangled 3D Human Generation

arXiv:2403.10166v1 · 3 citations · h-index: 3 · ECCV
Originality: Highly original
AI Analysis

The paper addresses two gaps in 3D human generation: the lack of semantic control and the limit on output resolution. Both matter for applications such as garment design and controllable synthesis.

The paper tackles high-resolution 3D human generation with semantic disentanglement. It presents what the authors describe as the first method for semantic disentangled human image synthesis, and the first to achieve 3D-aware image synthesis at $1024^2$ resolution, with computational cost reduced by a proposed 3D-aware super-resolution module.

With the development of neural radiance fields and generative models, numerous methods have been proposed for learning 3D human generation from 2D images. These methods allow control over the pose of the generated 3D human and enable rendering from different viewpoints. However, none of these methods explore semantic disentanglement in human image synthesis, i.e., they cannot disentangle the generation of different semantic parts, such as the body, tops, and bottoms. Furthermore, existing methods are limited to synthesizing images at $512^2$ resolution due to the high computational cost of neural radiance fields. To address these limitations, we introduce SemanticHuman-HD, the first method to achieve semantic disentangled human image synthesis. Notably, SemanticHuman-HD is also the first method to achieve 3D-aware image synthesis at $1024^2$ resolution, benefiting from our proposed 3D-aware super-resolution module. By leveraging the depth maps and semantic masks as guidance for the 3D-aware super-resolution, we significantly reduce the number of sampling points during volume rendering, thereby reducing the computational cost. Our comparative experiments demonstrate the superiority of our method. The effectiveness of each proposed component is also verified through ablation studies. Moreover, our method opens up exciting possibilities for various applications, including 3D garment generation, semantic-aware image synthesis, controllable image synthesis, and out-of-domain image synthesis.
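To make the cost argument in the abstract concrete, the following is a minimal sketch of depth-guided ray sampling, not the authors' implementation. It assumes a per-ray depth estimate is already available (for example from a lower-resolution render) and places a few samples in a narrow band around it instead of many samples across the full near-to-far range. The function names, the band half-width, and the toy density are all hypothetical, and the semantic-mask guidance mentioned in the abstract is not modeled here.

```python
import numpy as np

def uniform_samples(near, far, n):
    """Depths of n evenly spaced samples per ray between near and far."""
    t = np.linspace(0.0, 1.0, n)
    return near[..., None] + (far - near)[..., None] * t

def depth_guided_samples(depth, half_width, n, near, far):
    """Depths of n samples per ray inside a band around a depth estimate.

    Assumption for this sketch: `depth` is a coarse per-ray estimate.
    """
    lo = np.clip(depth - half_width, near, far)
    hi = np.clip(depth + half_width, near, far)
    return uniform_samples(lo, hi, n)

def render_weights(sigma, t_vals):
    """Volume-rendering weights w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    delta = np.diff(t_vals, axis=-1)
    delta = np.concatenate([delta, delta[..., -1:]], axis=-1)
    alpha = 1.0 - np.exp(-sigma * delta)
    # Transmittance T_i: probability the ray reaches sample i unoccluded.
    trans = np.cumprod(1.0 - alpha + 1e-10, axis=-1)
    trans = np.concatenate([np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    return alpha * trans

def toy_density(t_vals, surface=2.0, thickness=0.05, peak=200.0):
    """Hypothetical density field: a thin, nearly opaque shell at `surface`."""
    return peak * np.exp(-0.5 * ((t_vals - surface) / thickness) ** 2)

def expected_depth(t_vals):
    """Weighted mean sample depth along each ray."""
    w = render_weights(toy_density(t_vals), t_vals)
    return (w * t_vals).sum(axis=-1)

if __name__ == "__main__":
    near, far = np.array([1.0]), np.array([3.0])
    coarse_depth = np.array([2.05])  # slightly off, as a coarse estimate would be

    t_uniform = uniform_samples(near, far, 64)
    t_guided = depth_guided_samples(coarse_depth, 0.2, 8, near, far)

    print("uniform, 64 samples:", expected_depth(t_uniform))
    print("guided,   8 samples:", expected_depth(t_guided))
```

In this toy setting, both sampling schemes recover a similar depth for the shell, while the guided version evaluates the density at 8 points per ray instead of 64. The saving depends on how reliable the depth estimate is: a band that misses the surface would render it incorrectly, which is one reason a real system would need additional guidance.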

