CV · Apr 13, 2022

Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis

arXiv:2204.06307v1 · 59 citations · h-index: 34
Originality: Highly original
AI Analysis

This work addresses multi-view inconsistency in 3D-aware image generation, a targeted and substantial improvement for computer vision applications.

The paper tackles the problem of generating multi-view consistent images in 3D-aware image synthesis by proposing MVCGAN, which enforces geometry constraints through photometric consistency and stereo mixup, achieving state-of-the-art performance on three datasets.

3D-aware image synthesis aims to generate images of objects from multiple views by learning a 3D representation. However, one key challenge remains: existing approaches lack geometry constraints, hence usually fail to generate multi-view consistent images. To address this challenge, we propose Multi-View Consistent Generative Adversarial Networks (MVCGAN) for high-quality 3D-aware image synthesis with geometry constraints. By leveraging the underlying 3D geometry information of generated images, i.e., depth and camera transformation matrix, we explicitly establish stereo correspondence between views to perform multi-view joint optimization. In particular, we enforce the photometric consistency between pairs of views and integrate a stereo mixup mechanism into the training process, encouraging the model to reason about the correct 3D shape. Besides, we design a two-stage training strategy with feature-level multi-view joint optimization to improve the image quality. Extensive experiments on three datasets demonstrate that MVCGAN achieves the state-of-the-art performance for 3D-aware image synthesis.
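The stereo correspondence described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`warp_aux_to_primary`, `photometric_l1`, `stereo_mixup`), the pinhole intrinsics `K`, the nearest-neighbour sampling, the plain L1 error, and the linear blend with weight `alpha` are all assumptions made for illustration. The idea shown is that depth plus a camera transformation matrix lets each pixel of a primary view be matched to a pixel in an auxiliary view, so the two views can be compared and mixed.

```python
import numpy as np

def warp_aux_to_primary(aux_img, depth_pri, K, T_pri_to_aux):
    """Resample the auxiliary view at the pixels where primary-view points land.

    aux_img: (h, w, c) auxiliary image; depth_pri: (h, w) primary-view depth;
    K: (3, 3) pinhole intrinsics (assumed shared by both views);
    T_pri_to_aux: (4, 4) rigid transform from primary to auxiliary camera frame.
    """
    h, w = depth_pri.shape
    # Homogeneous pixel grid of the primary view, shape (3, h*w).
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)]).astype(np.float64)
    # Back-project every pixel to a 3D point in the primary camera frame.
    pts = (np.linalg.inv(K) @ pix) * depth_pri.ravel()
    # Move the points into the auxiliary camera frame.
    pts_aux = (T_pri_to_aux @ np.vstack([pts, np.ones((1, h * w))]))[:3]
    # Project onto the auxiliary image plane.
    proj = K @ pts_aux
    z = proj[2]
    valid = z > 1e-6
    z_safe = np.where(valid, z, 1.0)
    ua = np.rint(proj[0] / z_safe).astype(int)
    va = np.rint(proj[1] / z_safe).astype(int)
    valid &= (ua >= 0) & (ua < w) & (va >= 0) & (va < h)
    # Nearest-neighbour lookup; pixels that fall outside the image stay zero.
    warped = np.zeros((h * w,) + aux_img.shape[2:], dtype=np.float64)
    warped[valid] = aux_img[va[valid], ua[valid]]
    return warped.reshape((h, w) + aux_img.shape[2:]), valid.reshape(h, w)

def photometric_l1(pri_img, warped, valid):
    """Mean absolute colour difference over pixels with a valid correspondence."""
    diff = np.abs(pri_img - warped)[valid]
    return float(diff.mean()) if diff.size else 0.0

def stereo_mixup(pri_img, warped, valid, alpha):
    """Blend the warped auxiliary view into the primary view (assumed linear mix)."""
    mask = valid[..., None]
    return np.where(mask, alpha * warped + (1.0 - alpha) * pri_img, pri_img)
```

In a training loop, an error like `photometric_l1` would be computed on pairs of generated views and added to the adversarial objective, while a blended image like the output of `stereo_mixup` would be passed to the discriminator. A differentiable bilinear sampler would replace the nearest-neighbour lookup in practice, since rounding to integer pixels blocks gradients.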

Code Implementations: 1 repo
