CV · Mar 20

LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis

arXiv:2603.20176 · 97.11 citations · h-index: 19
AI Analysis

This work addresses efficient and accurate novel view synthesis for applications in computer vision and graphics. It is an incremental improvement that integrates 3D inductive biases into neural networks.

The paper tackles novel view synthesis by introducing LagerNVS, a neural network that uses 3D-aware latent features to achieve state-of-the-art deterministic feed-forward performance, including a PSNR of 31.4 on Re10k, while rendering in real time and generalizing to in-the-wild data.

Recent work has shown that neural networks can perform 3D tasks such as Novel View Synthesis (NVS) without explicit 3D reconstruction. Even so, we argue that strong 3D inductive biases are still helpful in the design of such networks. We show this point by introducing LagerNVS, an encoder-decoder neural network for NVS that builds on '3D-aware' latent features. The encoder is initialized from a 3D reconstruction network pre-trained using explicit 3D supervision. This is paired with a lightweight decoder, and trained end-to-end with photometric losses. LagerNVS achieves state-of-the-art deterministic feed-forward Novel View Synthesis (including 31.4 PSNR on Re10k), with and without known cameras, renders in real time, generalizes to in-the-wild data, and can be paired with a diffusion decoder for generative extrapolation.
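
For readers unfamiliar with the headline metric, the following is a minimal sketch of how PSNR (peak signal-to-noise ratio, in dB) is conventionally computed between a rendered view and its ground-truth image. It uses the standard definition, 10 * log10(MAX^2 / MSE); the abstract does not describe the authors' evaluation code, so the function name `psnr`, the flat-list image representation, and the [0, 1] intensity range are assumptions made for illustration only.

```python
import math


def psnr(rendered, target, max_value=1.0):
    """Return PSNR in dB between two images given as flat sequences of pixel values.

    Hypothetical helper for illustration; assumes intensities lie in [0, max_value].
    """
    if len(rendered) != len(target):
        raise ValueError("images must contain the same number of pixel values")
    if len(rendered) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixel values.
    mse = sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    rendered = [0.0, 0.5, 1.0, 0.25]
    target = [0.1, 0.4, 0.9, 0.35]
    print(f"PSNR: {psnr(rendered, target):.2f} dB")
```

Under this definition, a single image scoring 31.4 dB has a mean squared error of about 7.2e-4 on a [0, 1] intensity scale. Benchmark figures are typically averages over many test views, so the reported number should not be read as a per-image guarantee.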

