CV · Mar 20

OrbitNVS: Harnessing Video Diffusion Priors for Novel View Synthesis

arXiv:2603.19613 · 67.8 · h-index: 3
AI Analysis

This paper addresses the challenge of synthesizing plausible, consistent novel views of 3D objects, which matters for applications in computer vision and graphics. The advance is incremental, since it adapts existing video models rather than introducing a new one.

The paper tackles novel view synthesis (NVS) from limited views, particularly from a single input view. It proposes OrbitNVS, which reformulates NVS as an orbit video generation task and leverages pre-trained video diffusion priors. It reports significant improvements on the GSO and OmniObject3D benchmarks, with single-view PSNR gains such as +2.9 dB and +2.4 dB.
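To help calibrate those dB figures: PSNR is defined as 10·log10(MAX² / MSE), so a gain of g dB on a single image corresponds to multiplying its mean squared error by 10^(-g/10). The following is a minimal pure-Python sketch of that relationship, not the paper's evaluation code. The function names `psnr` and `mse_ratio_for_gain` are hypothetical, and pixel values are assumed to be floats in [0, max_val].

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumption: both inputs are equal-length sequences of floats in [0, max_val].
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE is multiplied for a PSNR gain of gain_db on one image."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # Every pixel is off by 0.1, so MSE is 0.01 and PSNR is about 20 dB.
    print(round(psnr([0.0, 0.0], [0.1, 0.1]), 3))
    for gain in (2.9, 2.4):
        print(gain, round(mse_ratio_for_gain(gain), 3))
```

Under this per-image reading, +2.9 dB corresponds to roughly half the squared error (a factor of about 0.51) and +2.4 dB to a factor of about 0.58. Benchmark PSNR is typically averaged over many images, so these ratios are only a rough guide.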

Novel View Synthesis (NVS) aims to generate unseen views of a 3D object given a limited number of known views. Existing methods often struggle to synthesize plausible views for unobserved regions, particularly under single-view input, and still face challenges in maintaining geometry and appearance consistency. To address these issues, we propose OrbitNVS, which reformulates NVS as an orbit video generation task. Through tailored model design and training strategies, we adapt a pre-trained video generation model to the NVS task, leveraging its rich visual priors to achieve high-quality view synthesis. Specifically, we incorporate camera adapters into the video model to enable accurate camera control. To enhance two key properties of 3D objects, geometry and appearance, we design a normal map generation branch and use normal map features to guide the synthesis of the target views via an attention mechanism, thereby improving geometric consistency. Moreover, we apply pixel-space supervision to alleviate the blurry appearance caused by spatial compression in the latent space. Extensive experiments show that OrbitNVS significantly outperforms previous methods on the GSO and OmniObject3D benchmarks, especially in the challenging single-view setting (e.g., +2.9 dB and +2.4 dB PSNR).
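To make the "orbit video" framing concrete, the sketch below generates camera positions on a circle around an object at the origin and builds a look-at orientation for each, so that consecutive poses form the frames of an orbit. This is an illustration under stated assumptions, not the paper's camera parameterization. The radius, elevation, view count, z-up convention, and the function names `orbit_camera_positions` and `look_at_origin` are all hypothetical.

```python
import math


def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / n for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def orbit_camera_positions(num_views, radius=2.0, elevation_deg=20.0):
    """Evenly spaced camera centers on a circle around the origin (z is up).

    Assumption: a fixed radius and elevation; the paper may use other settings.
    """
    elev = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        azim = 2.0 * math.pi * i / num_views
        positions.append((radius * math.cos(elev) * math.cos(azim),
                          radius * math.cos(elev) * math.sin(azim),
                          radius * math.sin(elev)))
    return positions


def look_at_origin(position, world_up=(0.0, 0.0, 1.0)):
    """Return (right, up, forward) unit vectors for a camera facing the origin.

    Raises ValueError when the view direction is parallel to world_up.
    """
    forward = _normalize(tuple(-c for c in position))
    right = _normalize(_cross(forward, world_up))
    up = _cross(right, forward)
    return right, up, forward


if __name__ == "__main__":
    for pos in orbit_camera_positions(num_views=4):
        right, up, forward = look_at_origin(pos)
        print([round(c, 3) for c in pos], [round(c, 3) for c in forward])
```

Each pose could then be encoded as per-frame conditioning for a video model; how OrbitNVS's camera adapters consume poses is not specified in the abstract.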

