TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks
TrackNeRF addresses a key limitation of NeRF for 3D reconstruction and novel view synthesis: the need for many accurately posed images, which breaks down when data is limited or imperfect. It represents a strong incremental advance over prior work.
The paper tackles the problem of learning Neural Radiance Fields (NeRFs) from sparse and noisy camera views, which is common in realistic setups. It introduces TrackNeRF, a method that uses feature tracks for globally consistent geometry reconstruction and pose optimization. On DTU, TrackNeRF improves PSNR over the state-of-the-art BARF and SPARF by ~8 and ~1, respectively.
Neural radiance fields (NeRFs) generally require many images with accurate poses for high-quality novel view synthesis, which does not reflect realistic setups where views can be sparse and poses can be noisy. Previous solutions for learning NeRFs with sparse views and noisy poses only consider local geometry consistency with pairs of views. Closely following \textit{bundle adjustment} in Structure-from-Motion (SfM), we introduce TrackNeRF for more globally consistent geometry reconstruction and more accurate pose optimization. TrackNeRF introduces \textit{feature tracks}, \ie connected pixel trajectories across \textit{all} visible views that correspond to the \textit{same} 3D points. By enforcing reprojection consistency among feature tracks, TrackNeRF explicitly encourages holistic 3D consistency. Through extensive experiments, TrackNeRF sets a new benchmark in noisy and sparse view reconstruction. In particular, TrackNeRF shows significant improvements over the state-of-the-art BARF and SPARF by $\sim8$ and $\sim1$, respectively, in terms of PSNR on DTU under various sparse and noisy view setups. The code is available at \url{https://tracknerf.github.io/}.
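
To make the idea of reprojection consistency along a feature track concrete, the following is a minimal sketch, not the authors' implementation. It assumes a shared pinhole intrinsic matrix `K`, 4x4 camera-to-world poses, and a known per-observation depth (in TrackNeRF this role would presumably be played by depth rendered from the radiance field). All names here (`backproject`, `project`, `track_reprojection_error`, `make_pose`) are hypothetical. Each observation in a track is lifted to 3D and projected into every other view of the same track, and the pixel distance to the tracked location is averaged.

```python
import numpy as np


def backproject(pixel, depth, K, cam_to_world):
    """Lift a pixel with known depth (z in the camera frame) to a 3D world point."""
    ray_cam = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    point_cam = ray_cam * depth  # ray_cam has z = 1 for a standard pinhole K
    return cam_to_world[:3, :3] @ point_cam + cam_to_world[:3, 3]


def project(point_world, K, cam_to_world):
    """Project a 3D world point into a camera and return pixel coordinates (u, v)."""
    world_to_cam = np.linalg.inv(cam_to_world)
    point_cam = world_to_cam[:3, :3] @ point_world + world_to_cam[:3, 3]
    uvw = K @ point_cam
    return uvw[:2] / uvw[2]


def track_reprojection_error(track, depths, K, poses):
    """Mean pixel error over all ordered view pairs (i, j) of one feature track.

    track  : list of (view_index, pixel) observations of the same 3D point
    depths : dict mapping view_index -> depth of that observation (assumed given)
    poses  : list of 4x4 camera-to-world matrices, indexed by view_index
    """
    errors = []
    for i, pixel_i in track:
        point = backproject(pixel_i, depths[i], K, poses[i])
        for j, pixel_j in track:
            if i != j:
                reprojected = project(point, K, poses[j])
                errors.append(np.linalg.norm(reprojected - np.asarray(pixel_j)))
    return float(np.mean(errors))


def make_pose(translation):
    """Hypothetical helper: identity rotation with the given camera position."""
    pose = np.eye(4)
    pose[:3, 3] = translation
    return pose


# Synthetic example: one 3D point observed by three cameras.
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
poses = [make_pose([0.0, 0.0, 0.0]), make_pose([0.5, 0.0, 0.0]), make_pose([0.0, 0.5, 0.0])]
point = np.array([0.1, 0.2, 4.0])
track = [(i, project(point, K, poses[i])) for i in range(3)]
depths = {i: 4.0 for i in range(3)}  # all cameras share z = 0, so depth is the point's z

err_clean = track_reprojection_error(track, depths, K, poses)

# Perturb one pose: the same track is no longer consistent across views.
noisy_poses = [poses[0], make_pose([0.6, 0.0, 0.0]), poses[2]]
err_noisy = track_reprojection_error(track, depths, K, noisy_poses)

print(err_clean, err_noisy)
```

With exact poses and depths the error is numerically zero, while perturbing a single pose raises it for every pair that involves that view. This is the signal a track-based objective can use to correct poses and geometry jointly, rather than pair by pair.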