CV · GR · Dec 19, 2024

LiftRefine: Progressively Refined View Synthesis from 3D Lifting with Volume-Triplane Representations

arXiv:2412.14464v1 · h-index: 27
Originality: Incremental advance
AI Analysis

This work addresses the ill-posed challenge of image-to-3D generation for applications in computer vision and graphics, representing an incremental improvement over existing view synthesis methods.

The paper tackles novel view synthesis from single or few-view images with a two-stage method: a reconstruction model lifts the inputs to 3D, and a diffusion model fills in missing details, with the two applied iteratively in a progressive refinement loop. The authors report results that surpass prior state-of-the-art methods on SRN-Car, CO3D, and Objaverse, along with sampling efficacy and multi-view consistency.

We propose a new view synthesis method that synthesizes a 3D neural field from single or few-view input images. To address the ill-posed nature of the image-to-3D generation problem, we devise a two-stage method that involves a reconstruction model and a diffusion model for view synthesis. Our reconstruction model first lifts one or more input images into 3D space, using a volume as the coarse-scale 3D representation followed by a tri-plane as the fine-scale 3D representation. To mitigate the ambiguity in occluded regions, our diffusion model then hallucinates missing details in the images rendered from the tri-planes. We then introduce a new progressive refinement technique that iteratively applies the reconstruction and diffusion models to gradually synthesize novel views, boosting the overall quality of the 3D representations and their renderings. Empirical evaluation demonstrates the superiority of our method over state-of-the-art methods on the synthetic SRN-Car dataset, the in-the-wild CO3D dataset, and the large-scale Objaverse dataset, while achieving both sampling efficacy and multi-view consistency.
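The progressive refinement loop described in the abstract can be sketched roughly as follows. This is an illustration of the control flow only, not the authors' implementation: the function name `progressive_refinement`, the `views_per_round` parameter, the use of a single azimuth angle as a camera pose, and the three toy stand-ins for the reconstruction model, the tri-plane renderer, and the diffusion model are all assumptions made for this example. The abstract does not say how many views are added per iteration or in what order.

```python
from typing import Callable, List, Sequence, Tuple

import numpy as np

Image = np.ndarray
Pose = float  # hypothetical pose: a single azimuth angle in degrees


def progressive_refinement(
    input_views: Sequence[Image],
    input_poses: Sequence[Pose],
    target_poses: Sequence[Pose],
    reconstruct: Callable[[List[Image], List[Pose]], np.ndarray],
    render: Callable[[np.ndarray, Pose], Image],
    refine: Callable[[Image], Image],
    views_per_round: int = 2,  # assumed schedule, not taken from the paper
) -> Tuple[np.ndarray, List[Image], List[Pose]]:
    """Alternate reconstruction and refinement until all target poses are covered."""
    views, poses = list(input_views), list(input_poses)
    remaining = list(target_poses)
    while remaining:
        # Stage 1: lift all views gathered so far into a 3D representation.
        field = reconstruct(views, poses)
        batch, remaining = remaining[:views_per_round], remaining[views_per_round:]
        for pose in batch:
            # Stage 2: render a coarse novel view, then let the refiner fill in details.
            coarse = render(field, pose)
            views.append(refine(coarse))
            poses.append(pose)
    # Final reconstruction uses the inputs plus every synthesized view.
    return reconstruct(views, poses), views, poses


# Toy stand-ins so the sketch runs; real models would replace these.
def toy_reconstruct(views: List[Image], poses: List[Pose]) -> np.ndarray:
    return np.mean(np.stack(views), axis=0)  # placeholder "3D field"


def toy_render(field: np.ndarray, pose: Pose) -> Image:
    return field.copy()  # placeholder renderer that ignores the pose


def toy_refine(image: Image) -> Image:
    return np.clip(image, 0.0, 1.0)  # placeholder for the diffusion model


if __name__ == "__main__":
    inputs = [np.full((4, 4, 3), 0.5)]
    field, views, poses = progressive_refinement(
        inputs, [0.0], [90.0, 180.0, 270.0],
        toy_reconstruct, toy_render, toy_refine,
    )
    print(len(views), poses)
```

Each round feeds the newly refined views back into the reconstruction step, which is the sense in which the 3D representation is improved gradually rather than in a single pass.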

