CV
Jun 15, 2022

GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds

arXiv:2206.07255v2
96 citations
h-index: 59
Originality: Incremental advance
AI Analysis

This work addresses the challenge of high-resolution 3D-aware image generation for applications in computer graphics and vision, representing a significant step towards bridging 2D and 3D generation, though it builds incrementally on prior generative radiance manifold approaches.

The paper tackles the problem of generating high-resolution, 3D-consistent images from unstructured single-image collections. It reaches resolutions up to 1024×1024 while maintaining strict 3D consistency, and reports significantly better results than existing methods on the FFHQ and AFHQv2 datasets.

Recent works have shown that 3D-aware GANs trained on unstructured single image collections can generate multiview images of novel instances. The key underpinnings to achieve this are a 3D radiance field generator and a volume rendering process. However, existing methods either cannot generate high-resolution images (e.g., only up to 256×256) due to the high computation cost of neural volume rendering, or rely on 2D CNNs for image-space upsampling, which jeopardizes the 3D consistency across different views. This paper proposes a novel 3D-aware GAN that can generate high-resolution images (up to 1024×1024) while keeping strict 3D consistency as in volume rendering. Our motivation is to achieve super-resolution directly in the 3D space to preserve 3D consistency. We avoid the otherwise prohibitively expensive computation cost by applying 2D convolutions on a set of 2D radiance manifolds defined in the recent generative radiance manifold (GRAM) approach, and apply dedicated loss functions for effective GAN training at high resolution. Experiments on FFHQ and AFHQv2 datasets show that our method can produce high-quality 3D-consistent results that significantly outperform existing methods. It makes a significant step towards closing the gap between traditional 2D image generation and 3D-consistent free-view generation.
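
The core idea in the abstract, super-resolving a set of 2D radiance manifolds before rendering rather than upsampling the rendered image, can be illustrated with a rough NumPy sketch. This is not the authors' architecture: the function names, the nearest-neighbour upsampling, the single shared convolution kernel, and the array layout are all assumptions chosen for brevity. It only shows why the result stays view-consistent: every view composites samples from the same upsampled manifold maps.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) map (assumed stand-in for a learned upsampler)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def conv2d_same(x, kernel):
    """Stride-1, zero-padded 2D convolution (cross-correlation, as in deep-learning libraries).

    x: (C_in, H, W), kernel: (C_out, C_in, k, k) with odd k.
    """
    c_out, _, k, _ = kernel.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            # Accumulate the contribution of kernel tap (i, j) over all channels.
            out += np.einsum("oc,chw->ohw", kernel[:, :, i, j], xp[:, i:i + h, j:j + w])
    return out

def super_resolve_manifolds(maps, kernel, factor=2):
    """Apply the same 2D upsample + convolution to every manifold map.

    maps: (N, C, H, W), one low-resolution radiance map per manifold (hypothetical layout).
    Returns an (N, C_out, H * factor, W * factor) array.
    """
    return np.stack([conv2d_same(upsample_nearest(m, factor), kernel) for m in maps])

def composite(colors, alphas):
    """Front-to-back alpha compositing of one ray's samples, one sample per manifold.

    colors: (N, 3), alphas: (N,), ordered from nearest to farthest manifold.
    """
    transmittance = 1.0
    pixel = np.zeros(3)
    for color, alpha in zip(colors, alphas):
        pixel += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return pixel
```

In this sketch, `super_resolve_manifolds` runs once per generated instance, and any camera view is then rendered by looking up colour and opacity where each ray crosses each manifold and passing them to `composite`. Because the 2D convolutions act on the manifolds rather than on a rendered image, no view-dependent upsampling step is introduced.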

