CNN-Augmented Visual-Inertial SLAM with Planar Constraints
This work targets the robustness of SLAM for robotics and AR/VR applications; its contribution is incremental, since it builds on existing methods such as ORB-SLAM3.
The paper aims to improve visual-inertial SLAM by integrating CNN depth and uncertainty predictions, together with planar constraints, into the back-end optimization, and it reports improved results over ORB-SLAM3 on the EuRoC dataset.
We present a robust visual-inertial SLAM system that combines the benefits of Convolutional Neural Networks (CNNs) and planar constraints. Our system uses a CNN to predict a depth map and a corresponding uncertainty map for each image. The CNN depth bootstraps the back-end optimization of SLAM, while the CNN uncertainty adaptively weights the contribution of each feature point to that optimization. Given the gravity direction from the inertial sensor, we further present a fast plane detection method that detects horizontal planes via one-point RANSAC and vertical planes via two-point RANSAC. The stably detected planes are in turn used to regularize the back-end optimization. We evaluate our system on the public EuRoC dataset and demonstrate improved results over ORB-SLAM3, a state-of-the-art SLAM system.
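The abstract does not spell out the plane detection step, so the following is only a minimal sketch of why a known gravity direction reduces the RANSAC sample size. It assumes planes are written as n·p = d over 3D points. A horizontal plane has its normal parallel to gravity, so a single point fixes d. A vertical plane has its normal perpendicular to gravity, so two points fix both n and d. The function names, tolerance, and iteration counts are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
import random


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return (v[0] / norm, v[1] / norm, v[2] / norm)


def find_inliers(points, normal, offset, tol):
    # Indices of points whose distance to the plane n.p = d is below tol.
    return [i for i, p in enumerate(points)
            if abs(dot(normal, p) - offset) < tol]


def detect_horizontal_plane(points, gravity, iters=50, tol=0.02, seed=0):
    """One-point RANSAC: the normal is fixed to the gravity direction,
    so each hypothesis needs only one sampled point to set the offset."""
    rng = random.Random(seed)
    normal = normalize(gravity)
    best_offset, best_inliers = None, []
    for _ in range(iters):
        offset = dot(normal, rng.choice(points))
        inliers = find_inliers(points, normal, offset, tol)
        if len(inliers) > len(best_inliers):
            best_offset, best_inliers = offset, inliers
    return normal, best_offset, best_inliers


def detect_vertical_plane(points, gravity, iters=100, tol=0.02, seed=0):
    """Two-point RANSAC: the normal must be perpendicular to gravity and
    to the segment joining the two sampled points."""
    rng = random.Random(seed)
    g = normalize(gravity)
    best = (None, None, [])
    for _ in range(iters):
        p1, p2 = rng.sample(points, 2)
        diff = (p2[0] - p1[0], p2[1] - p1[1], p2[2] - p1[2])
        n = cross(g, diff)
        if dot(n, n) < 1e-12:
            # Degenerate sample: the segment is parallel to gravity.
            continue
        n = normalize(n)
        offset = dot(n, p1)
        inliers = find_inliers(points, n, offset, tol)
        if len(inliers) > len(best[2]):
            best = (n, offset, inliers)
    return best
```

Under these assumptions, each function returns a `(normal, offset, inlier_indices)` triple for the best-supported plane. A full system would presumably run the detection repeatedly, remove inliers between runs, and keep only planes that are observed stably over time before using them as constraints.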