S-NeRF: Neural Radiance Fields for Street Views
This work addresses realistic street-view synthesis for applications such as self-driving cars. It is an incremental improvement that adapts NeRFs to the specific challenges of this domain.
The paper tackles the problem of synthesizing novel views from street-view data, which is challenging for existing Neural Radiance Fields (NeRFs) due to large-scale unbounded scenes and non-overlapping camera views, resulting in artifacts like blurs and 'floaters'. The proposed S-NeRF method improves scene parameterization, uses LiDAR data to enhance training, and extends to reconstruct moving vehicles, achieving reductions of 7% to 40% in mean-squared error for street-view synthesis and a 45% PSNR gain for moving vehicle rendering on datasets like nuScenes and Waymo.
Neural Radiance Fields (NeRFs) aim to synthesize novel views of objects and scenes, given object-centric camera views with large overlaps. However, we conjecture that this paradigm does not fit the nature of street views, which are collected by many self-driving cars from large-scale unbounded scenes. Also, the onboard cameras perceive scenes with little overlap. Thus, existing NeRFs often produce blurs, 'floaters' and other artifacts in street-view synthesis. In this paper, we propose a new street-view NeRF (S-NeRF) that jointly considers novel view synthesis of both the large-scale background scenes and the foreground moving vehicles. Specifically, we improve the scene parameterization function and the camera poses for learning better neural representations from street views. We also use the noisy and sparse LiDAR points to boost the training, and learn a robust geometry- and reprojection-based confidence to address the depth outliers. Moreover, we extend our S-NeRF to reconstruct moving vehicles, which is impracticable for conventional NeRFs. Thorough experiments on large-scale driving datasets (e.g., nuScenes and Waymo) demonstrate that our method beats state-of-the-art rivals, reducing the mean-squared error by 7% to 40% in street-view synthesis and achieving a 45% PSNR gain for moving-vehicle rendering.
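
The abstract reports results in two metrics, mean-squared error and PSNR. As a rough reading aid, the sketch below shows how the two relate under the standard definition PSNR = 10 * log10(MAX^2 / MSE), assuming pixel values normalized to [0, 1]. The function names are illustrative and do not come from the paper, and the sketch does not reproduce how the paper computed its own figures.

```python
import math


def psnr_from_mse(mse: float, max_value: float = 1.0) -> float:
    """Return PSNR in dB for a given MSE and peak pixel value (assumed 1.0)."""
    if mse <= 0.0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(max_value ** 2 / mse)


def psnr_gain_from_mse_reduction(reduction: float) -> float:
    """Return the dB gained when MSE shrinks by `reduction` (0.07 = 7% lower).

    The gain depends only on the ratio of the two MSE values,
    so the baseline MSE cancels out.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return -10.0 * math.log10(1.0 - reduction)


if __name__ == "__main__":
    # An MSE of 0.01 on [0, 1] images corresponds to 20 dB.
    print(f"{psnr_from_mse(0.01):.2f} dB")
    # The two ends of the 7% to 40% MSE-reduction range, expressed in dB.
    for r in (0.07, 0.40):
        print(f"{r:.0%} lower MSE -> +{psnr_gain_from_mse_reduction(r):.2f} dB")
```

Under these assumptions, a 7% MSE reduction is worth about 0.32 dB and a 40% reduction about 2.22 dB, provided both methods are evaluated on the same images. The 45% PSNR gain quoted for moving vehicles is a relative change in the PSNR value itself and cannot be derived from these helpers.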