CV · Apr 24, 2023

Unsupervised Style-based Explicit 3D Face Reconstruction from Single Image

arXiv:2304.12455v1 · 2 citations · h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating manipulable 3D facial models without supervision. The advance is incremental: it combines two existing architectures rather than introducing a new one, with the aim of improving performance in computer vision applications.

The paper tackles the ill-posed problem of 3D face reconstruction from a single image by proposing an unsupervised adversarial learning framework that merges explicit 3D reconstruction with style transfer, outperforming baselines like DepthNet and Pix2NeRF across three facial datasets.

Inferring 3D object structures from a single image is an ill-posed task due to depth ambiguity and occlusion. Typical resolutions in the literature include leveraging 2D or 3D ground truth for supervised learning, as well as imposing hand-crafted symmetry priors or using an implicit representation to hallucinate novel viewpoints for unsupervised methods. In this work, we propose a general adversarial learning framework for solving Unsupervised 2D to Explicit 3D Style Transfer (UE3DST). Specifically, we merge two architectures: the unsupervised explicit 3D reconstruction network of Wu et al. and the Generative Adversarial Network (GAN) named StarGAN-v2. We experiment across three facial datasets (Basel Face Model, 3DFAW and CelebA-HQ) and show that our solution is able to outperform well-established solutions such as DepthNet in 3D reconstruction and Pix2NeRF in conditional style transfer, while we also justify the individual contributions of our model components via ablation. In contrast to the aforementioned baselines, our scheme produces features for explicit 3D rendering, which can be manipulated and utilized in downstream tasks.
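
The depth ambiguity mentioned in the abstract can be illustrated with a minimal sketch, assuming an idealized pinhole camera. This is not code from the paper: the function name `project` and the `focal_length` parameter are hypothetical and chosen only for illustration. The sketch shows that two 3D points lying on the same camera ray land on the same pixel, so a single image alone cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a camera-space 3D point (x, y, z) to 2D image coordinates.

    Illustrative assumption: an ideal pinhole camera at the origin looking along +z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same ray through the camera centre.
near = (0.5, 0.25, 2.0)
far = tuple(2.0 * c for c in near)  # same direction, twice the distance

# Both project to the same image location, so depth is lost in the image.
print(project(near))
print(project(far))
```

Because every point along a ray projects identically, recovering depth from one image needs extra information, such as the supervision, symmetry priors, or learned representations that the abstract lists.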
