CV · Dec 9, 2025

On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs

arXiv:2512.08498v1 · 1 citation · h-index: 6
Originality: Highly original
AI Analysis

This paper addresses real-time, large-scale 3D scene reconstruction for applications such as robotics and AR/VR, and represents a significant advance rather than an incremental improvement.

The paper tackles the problem of incomplete 3D coverage in real-time reconstruction from monocular RGB streams by developing the first on-the-fly 3D reconstruction framework for multi-camera rigs, achieving reconstruction of hundreds of meters of 3D scenes within just 2 minutes using raw multi-camera video streams.

Recent advances in 3D Gaussian Splatting (3DGS) have enabled efficient free-viewpoint rendering and photorealistic scene reconstruction. While on-the-fly extensions of 3DGS have shown promise for real-time reconstruction from monocular RGB streams, they often fail to achieve complete 3D coverage due to the limited field of view (FOV). Employing a multi-camera rig fundamentally addresses this limitation. In this paper, we present the first on-the-fly 3D reconstruction framework for multi-camera rigs. Our method incrementally fuses dense RGB streams from multiple overlapping cameras into a unified Gaussian representation, achieving drift-free trajectory estimation and efficient online reconstruction. We propose a hierarchical camera initialization scheme that enables coarse inter-camera alignment without calibration, followed by a lightweight multi-camera bundle adjustment that stabilizes trajectories while maintaining real-time performance. Furthermore, we introduce a redundancy-free Gaussian sampling strategy and a frequency-aware optimization scheduler to reduce the number of Gaussian primitives and the required optimization iterations, thereby maintaining both efficiency and reconstruction fidelity. Our method reconstructs hundreds of meters of 3D scenes within just 2 minutes using only raw multi-camera video streams, demonstrating unprecedented speed, robustness, and fidelity for on-the-fly 3D scene reconstruction.
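
The abstract does not spell out how the rig is parameterized, so the following is only a generic illustration of why a rigid multi-camera rig is convenient for trajectory estimation, not the paper's implementation. It assumes the common convention that each camera has a fixed camera-to-rig transform and that a single rig-to-world pose is estimated per time step; the helper names (`make_pose`, `camera_poses_in_world`) and the two-camera layout are hypothetical.

```python
import numpy as np

def make_pose(yaw_rad, translation):
    """Build a 4x4 rigid transform from a yaw angle about z and a translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def camera_poses_in_world(world_from_rig, rig_from_cams):
    """Compose one rig pose with the fixed per-camera extrinsics."""
    return [world_from_rig @ rig_from_cam for rig_from_cam in rig_from_cams]

# Hypothetical rig: two cameras 20 cm apart, the second rotated 90 degrees.
rig_from_cams = [
    make_pose(0.0, [0.1, 0.0, 0.0]),
    make_pose(np.pi / 2, [-0.1, 0.0, 0.0]),
]

# One rig pose per time step yields every camera pose at that time step.
world_from_rig = make_pose(0.0, [5.0, 2.0, 0.0])
world_from_cams = camera_poses_in_world(world_from_rig, rig_from_cams)

for i, T in enumerate(world_from_cams):
    print(f"camera {i} center in world: {T[:3, 3]}")
```

Under this assumed parameterization, each time step contributes one six-degree-of-freedom rig pose regardless of the number of cameras, which is one general reason rig-based bundle adjustment can stay small.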
