CV, GR
Jul 21, 2022

Neural Pixel Composition: 3D-4D View Synthesis from Multi-Views

CMU
arXiv:2207.10663v1
9 citations
h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and robust view synthesis in computer vision, offering incremental improvements in training speed and in handling sparse input views.

The paper tackles the problem of 3D-4D view synthesis from sparse multi-view observations, achieving results comparable to or better than existing methods while training 200-400 times faster for high-resolution content.

We present Neural Pixel Composition (NPC), a novel approach for continuous 3D-4D view synthesis given only a discrete set of multi-view observations as input. Existing state-of-the-art approaches require dense multi-view supervision and an extensive computational budget. The proposed formulation reliably operates on sparse and wide-baseline multi-view imagery and can be trained efficiently within a few seconds to 10 minutes for hi-res (12MP) content, i.e., 200-400X faster convergence than existing methods. Crucial to our approach are two core novelties: 1) a representation of a pixel that contains color and depth information accumulated from multi-views for a particular location and time along a line of sight, and 2) a multi-layer perceptron (MLP) that enables the composition of this rich information provided for a pixel location to obtain the final color output. We experiment with a large variety of multi-view sequences, compare to existing approaches, and achieve better results in diverse and challenging settings. Finally, our approach enables dense 3D reconstruction from sparse multi-views, where COLMAP, a state-of-the-art 3D reconstruction approach, struggles.
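The two ingredients named in the abstract, a per-pixel bundle of color and depth samples gathered from several views and an MLP that composes that bundle into one output color, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the near-to-far ordering of samples, the flat feature layout, the layer sizes, and the helper names `build_pixel_feature`, `init_mlp`, and `compose`. The weights are random and untrained, so the output color is meaningless; the sketch only shows the data flow.

```python
import math
import random


def build_pixel_feature(samples):
    """Flatten per-view (r, g, b, depth) samples for one pixel.

    Samples are ordered near-to-far along the line of sight
    (an assumed ordering, chosen only for this illustration).
    """
    ordered = sorted(samples, key=lambda s: s[3])
    return [value for sample in ordered for value in sample]


def init_mlp(in_dim, hidden, rng):
    """Create a two-layer MLP with random, untrained weights (hypothetical sizes)."""
    def layer(n_in, n_out):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        return weights, [0.0] * n_out

    return [layer(in_dim, hidden), layer(hidden, 3)]


def compose(mlp, feature):
    """Map one pixel's feature vector to an RGB triple in (0, 1)."""
    (w1, b1), (w2, b2) = mlp
    # Hidden layer with ReLU activation.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, feature)) + b)
              for row, b in zip(w1, b1)]
    # Output layer with a sigmoid so each channel is a valid color value.
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


rng = random.Random(0)
num_views = 4  # a sparse set of source views (made-up count)
# Synthetic stand-ins for the color and depth each view contributes to one pixel.
samples = [(rng.random(), rng.random(), rng.random(), rng.uniform(0.5, 5.0))
           for _ in range(num_views)]

feature = build_pixel_feature(samples)
mlp = init_mlp(len(feature), 16, rng)
rgb = compose(mlp, feature)
```

In a real system the MLP would be fitted so that the composed color matches held-out observations; that training step is deliberately left out here.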
