360DVO: Deep Visual Odometry for Monocular 360-Degree Camera
This work addresses robustness issues in visual odometry for 360-degree cameras. The contribution is incremental: it applies deep learning to an existing domain-specific problem.
The paper tackles the problem of monocular omnidirectional visual odometry (OVO) by introducing 360DVO, a deep learning-based framework that improves robustness by 50% and accuracy by 37.5% over state-of-the-art methods in challenging scenarios like aggressive motion and varying illumination.
Monocular omnidirectional visual odometry (OVO) systems leverage 360-degree cameras to overcome field-of-view limitations of perspective VO systems. However, existing methods, reliant on handcrafted features or photometric objectives, often lack robustness in challenging scenarios, such as aggressive motion and varying illumination. To address this, we present 360DVO, the first deep learning-based OVO framework. Our approach introduces a distortion-aware spherical feature extractor (DAS-Feat) that adaptively learns distortion-resistant features from 360-degree images. These sparse feature patches are then used to establish constraints for effective pose estimation within a novel omnidirectional differentiable bundle adjustment (ODBA) module. To facilitate evaluation in realistic settings, we also contribute a new real-world OVO benchmark. Extensive experiments on this benchmark and public synthetic datasets (TartanAir V2 and 360VO) demonstrate that 360DVO surpasses state-of-the-art baselines (including 360VO and OpenVSLAM), improving robustness by 50% and accuracy by 37.5%. Homepage: https://chris1004336379.github.io/360DVO-homepage
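The abstract does not spell out the projection model behind DAS-Feat or ODBA, so the following is only a minimal sketch of the standard equirectangular camera model that 360-degree pipelines commonly build on, not the authors' implementation. The axis convention (x right, y down, z forward) and the function names `pixel_to_bearing`, `bearing_to_pixel`, `horizontal_stretch`, and `angular_error` are assumptions introduced for illustration. The sketch shows why features near the poles are distorted (horizontal stretch grows as 1/cos(latitude)) and how a residual between an observed and a predicted direction can be measured on the sphere.

```python
import math

def pixel_to_bearing(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit bearing vector.

    Assumed convention: x right, y down, z forward; image center looks along +z.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi]
    lat = (0.5 - v / height) * math.pi        # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = -math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

def bearing_to_pixel(p, width, height):
    """Project a 3D point or bearing (x, y, z) back to equirectangular pixels."""
    x, y, z = p
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(-y / norm)
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return (u, v)

def horizontal_stretch(v, height):
    """Horizontal stretch of image row v relative to the equator: 1 / cos(latitude)."""
    lat = (0.5 - v / height) * math.pi
    return 1.0 / math.cos(lat)

def angular_error(a, b):
    """Angle in radians between two bearing vectors (a spherical residual)."""
    cx = a[1] * b[2] - a[2] * b[1]
    cy = a[2] * b[0] - a[0] * b[2]
    cz = a[0] * b[1] - a[1] * b[0]
    cross_norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    return math.atan2(cross_norm, dot)

if __name__ == "__main__":
    W, H = 400, 200
    bearing = pixel_to_bearing(100.0, 50.0, W, H)
    print(bearing_to_pixel(bearing, W, H))   # round trip back to the input pixel
    print(horizontal_stretch(H / 6.0, H))    # row at 60 degrees latitude
```

Under these assumptions, a row at 60 degrees latitude is stretched to roughly twice its equatorial width, which is the kind of distortion a distortion-aware feature extractor has to cope with. Measuring the residual as an angle on the sphere rather than in pixels also avoids the wrap-around seam at the left and right image borders.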