DH-PTAM: A Deep Hybrid Stereo Events-Frames Parallel Tracking And Mapping System
This work addresses reliable SLAM in adverse conditions for robotics and autonomous systems, improving incrementally on prior approaches through hybrid sensor fusion.
The paper tackles robust visual parallel tracking and mapping in challenging environments by combining stereo event-based and frame-based sensors with deep learning features. On the VECtor and TUM-VIE benchmarks, it demonstrates superior robustness and accuracy in adverse conditions, especially in large-scale HDR scenarios.
This paper presents a robust approach for a visual parallel tracking and mapping (PTAM) system that excels in challenging environments. Our proposed method combines the strengths of heterogeneous multi-modal visual sensors, including stereo event-based and frame-based sensors, in a unified reference frame through a novel spatio-temporal synchronization of stereo visual frames and stereo event streams. To further enhance robustness, we employ deep learning-based feature extraction and description for estimation. We also introduce an end-to-end parallel tracking and mapping optimization layer, complemented by a simple loop-closure algorithm for efficient SLAM behavior. Through comprehensive experiments on both small-scale and large-scale real-world sequences from the VECtor and TUM-VIE benchmarks, our proposed method (DH-PTAM) demonstrates superior robustness and accuracy in adverse conditions, especially in large-scale HDR scenarios. A research-oriented Python API of our implementation is publicly available on GitHub for further research and development: https://github.com/AbanobSoliman/DH-PTAM.
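To make the idea of temporally associating an event stream with a frame more concrete, the following is a minimal sketch and not the DH-PTAM implementation. It assumes the simplest possible scheme: events whose timestamps fall in a fixed window centered on the frame timestamp are accumulated into a signed polarity image. The function names (`events_in_window`, `accumulate_event_frame`), the `(t, x, y, polarity)` event layout, and the `half_window` parameter are all hypothetical choices made for illustration; the spatial part of the synchronization (stereo rectification and alignment into a unified reference frame) is not covered.

```python
from bisect import bisect_left, bisect_right


def events_in_window(event_ts, frame_t, half_window):
    """Return (start, end) slice indices of events with
    frame_t - half_window <= t <= frame_t + half_window.

    event_ts must be sorted in ascending order.
    """
    start = bisect_left(event_ts, frame_t - half_window)
    end = bisect_right(event_ts, frame_t + half_window)
    return start, end


def accumulate_event_frame(events, frame_t, half_window, width, height):
    """Accumulate event polarities around a frame timestamp into a 2D image.

    events: list of (t, x, y, polarity) tuples sorted by t (assumed layout),
            with polarity in {+1, -1}.
    Returns a height x width list of lists holding the summed polarity per pixel.
    """
    timestamps = [e[0] for e in events]
    start, end = events_in_window(timestamps, frame_t, half_window)
    image = [[0] * width for _ in range(height)]
    for _, x, y, polarity in events[start:end]:
        image[y][x] += polarity
    return image


# Illustrative, made-up events: (timestamp in seconds, x, y, polarity).
events = [
    (0.010, 0, 0, +1),
    (0.048, 1, 0, +1),
    (0.050, 1, 0, -1),
    (0.052, 2, 1, +1),
    (0.090, 0, 1, +1),
]
image = accumulate_event_frame(events, frame_t=0.050, half_window=0.005,
                               width=3, height=2)
```

In this sketch, only the three events within 5 ms of the frame timestamp contribute to the image. A symmetric window is one arbitrary choice; a window ending at the frame timestamp, or a fixed event count, would be equally plausible alternatives under different assumptions.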