CV | AI | RO | Jan 11, 2025

Aug3D: Augmenting large scale outdoor datasets for Generalizable Novel View Synthesis

arXiv:2501.06431v1 | 1 citation | h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the challenge of scaling photorealistic novel view synthesis from small indoor to large outdoor scenes, though it is incremental as it builds on existing models like PixelNeRF.

The authors tackle generalizable novel view synthesis for large-scale outdoor scenes by training a feed-forward model (PixelNeRF) on the UrbanScene3D dataset. Reducing the number of views per cluster from 20 to 10 improves PSNR by 10%, and Aug3D, their augmentation technique, further improves novel view prediction by adding newly generated views to the original dataset.
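
For readers unfamiliar with the metric, PSNR (peak signal-to-noise ratio) can be computed as in the minimal sketch below. The function name `psnr` and the assumption that images are float arrays scaled to [0, 1] are illustrative choices, not details taken from the paper.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images.

    Assumes both images share the same shape and are scaled to
    [0, max_val]; this scaling is an assumption of the sketch.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

If the reported 10% is read as a relative change in the dB value, a hypothetical baseline of 20 dB would rise to 22 dB; these numbers are illustrative only and do not come from the paper.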

Recent photorealistic Novel View Synthesis (NVS) advances have increasingly gained attention. However, these approaches remain constrained to small indoor scenes. While optimization-based NVS models have attempted to address this, generalizable feed-forward methods, offering significant advantages, remain underexplored. In this work, we train PixelNeRF, a feed-forward NVS model, on the large-scale UrbanScene3D dataset. We propose four training strategies to cluster and train on this dataset, highlighting that performance is hindered by limited view overlap. To address this, we introduce Aug3D, an augmentation technique that leverages reconstructed scenes using traditional Structure-from-Motion (SfM). Aug3D generates well-conditioned novel views through grid and semantic sampling to enhance feed-forward NVS model learning. Our experiments reveal that reducing the number of views per cluster from 20 to 10 improves PSNR by 10%, but the performance remains suboptimal. Aug3D further addresses this by combining the newly generated novel views with the original dataset, demonstrating its effectiveness in improving the model's ability to predict novel views.
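
The grid sampling idea mentioned in the abstract can be illustrated with a rough sketch: place candidate camera positions on a regular grid above the bounding box of an SfM point cloud and aim each one at the scene centroid. This is not the authors' implementation; the function names, the grid resolution, and the `height_offset` parameter are all hypothetical, and the abstract does not specify how Aug3D chooses positions or orientations.

```python
import numpy as np

def grid_camera_positions(points, n_x=4, n_y=4, height_offset=50.0):
    """Candidate camera centers on a regular x-y grid above a point cloud.

    points: (N, 3) array of reconstructed 3D points (assumed z-up).
    height_offset: hypothetical height above the highest point.
    """
    points = np.asarray(points, dtype=np.float64)
    lo, hi = points.min(axis=0), points.max(axis=0)  # axis-aligned bounds
    xs = np.linspace(lo[0], hi[0], n_x)
    ys = np.linspace(lo[1], hi[1], n_y)
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    gz = np.full(gx.shape, hi[2] + height_offset)
    return np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)

def look_at_directions(positions, target):
    """Unit viewing directions from each camera center toward target."""
    d = np.asarray(target, dtype=np.float64) - positions
    return d / np.linalg.norm(d, axis=1, keepdims=True)

# Usage with a synthetic point cloud standing in for an SfM reconstruction.
rng = np.random.default_rng(0)
cloud = rng.uniform([0, 0, 0], [100, 200, 30], size=(1000, 3))
cams = grid_camera_positions(cloud)
dirs = look_at_directions(cams, cloud.mean(axis=0))
```

Views rendered from such positions would then be combined with the original images for training, as the abstract describes; the rendering step itself is outside the scope of this sketch.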
