CV | Aug 27, 2024

Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty

Tsinghua
arXiv:2408.15242v1 | 19 citations | h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the need for more realistic simulation in autonomous driving by incrementally improving 3D Gaussian Splatting with cross-view (aerial and ground) data.

The paper tackles novel view synthesis for large-scale road scenes in autonomous driving simulation, using both drone and ground-vehicle imagery. It improves rendering fidelity with an uncertainty-aware training method that weights pixels by cross-view uncertainty.

Robust and realistic rendering of large-scale road scenes is essential in autonomous driving simulation. Recently, 3D Gaussian Splatting (3D-GS) has made groundbreaking progress in neural rendering, but the fidelity of large-scale road scene renderings is often limited by the input imagery, which usually has a narrow field of view and focuses mainly on the local street-level area. Intuitively, data from a drone's perspective can complement data from the ground vehicle's perspective, enhancing the completeness of scene reconstruction and rendering. However, training naively on aerial and ground images, which exhibit large view disparity, poses a significant convergence challenge for 3D-GS and does not yield notable performance improvements on road views. To enhance novel view synthesis of road views and make effective use of the aerial information, we design an uncertainty-aware training method that lets aerial images assist in synthesizing areas where ground images are learned poorly, instead of weighting all pixels equally in 3D-GS training as prior work does. We are the first to introduce cross-view uncertainty to 3D-GS: we match the car-view ensemble-based rendering uncertainty to aerial images and use it to weight the contribution of each pixel to the training process. Additionally, to systematically quantify evaluation metrics, we assemble a high-quality synthesized dataset comprising both aerial and ground images of road scenes.
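
The pixel-weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the ensemble uncertainty is the per-pixel variance across several renders of the same view, that the resulting map has already been matched into the aerial image's pixel grid (the matching step is omitted), and that a min-max normalization followed by a weighted L1 loss is an acceptable stand-in for the paper's weighting scheme. The function names `ensemble_uncertainty`, `normalize_weights`, and `weighted_l1_loss` are hypothetical.

```python
import numpy as np


def ensemble_uncertainty(renders):
    """Per-pixel uncertainty as variance across ensemble members.

    renders: array of shape (M, H, W, 3), one render per ensemble member.
    Returns an (H, W) map (variance averaged over color channels).
    Assumption: variance is used as the uncertainty measure.
    """
    renders = np.asarray(renders, dtype=np.float64)
    return renders.var(axis=0).mean(axis=-1)


def normalize_weights(uncertainty, eps=1e-8):
    """Min-max scale an uncertainty map to [0, 1] to use as pixel weights.

    Assumption: higher uncertainty means a higher training weight.
    A constant map falls back to uniform weights.
    """
    u = np.asarray(uncertainty, dtype=np.float64)
    span = u.max() - u.min()
    if span < eps:
        return np.ones_like(u)
    return (u - u.min()) / span


def weighted_l1_loss(pred, target, weights, eps=1e-8):
    """L1 photometric loss where each pixel contributes by its weight."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    per_pixel = np.abs(pred - target).mean(axis=-1)  # shape (H, W)
    return float((weights * per_pixel).sum() / (weights.sum() + eps))


# Toy example: two ensemble members agree on column 0, disagree on column 1.
member_a = np.zeros((2, 2, 3))
member_b = np.zeros((2, 2, 3))
member_b[:, 1, :] = 1.0
weights = normalize_weights(ensemble_uncertainty([member_a, member_b]))

# An aerial-image error in the uncertain column is penalized; the same
# error in the confident column is ignored under these weights.
pred = np.zeros((2, 2, 3))
target_uncertain = np.zeros((2, 2, 3))
target_uncertain[:, 1, :] = 1.0
target_confident = np.zeros((2, 2, 3))
target_confident[:, 0, :] = 1.0
loss_uncertain = weighted_l1_loss(pred, target_uncertain, weights)
loss_confident = weighted_l1_loss(pred, target_confident, weights)
```

With equal weights both errors would contribute the same loss; under the uncertainty weights, aerial supervision is concentrated on the regions the ground-view ensemble reconstructs least consistently.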

Code Implementations: 1 repo