Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs
This work addresses the challenging problem of synthesizing novel views without known camera poses, with applications in 3D vision. Its contribution is a novel integration of existing tasks rather than an incremental advance.
The paper tackles pose-free novel view synthesis from stereo pairs with a unified framework that integrates correspondence matching, pose estimation, and NeRF rendering. It achieves substantial improvement over previous methods when viewpoint changes are extreme and accurate camera poses are unavailable.
This work addresses pose-free novel view synthesis from stereo pairs, a challenging and pioneering task in 3D vision. Unlike prior approaches, our framework integrates 2D correspondence matching, camera pose estimation, and NeRF rendering so that the three tasks reinforce one another. We achieve this by designing an architecture built on a shared representation, which serves as a foundation for enhanced 3D geometry understanding. Capitalizing on the inherent interplay between the tasks, we train the unified framework end-to-end with the proposed training strategy to improve overall model accuracy. Through extensive evaluations across diverse indoor and outdoor scenes from two real-world datasets, we demonstrate that our approach achieves substantial improvement over previous methods, especially under extreme viewpoint changes and in the absence of accurate camera poses.