RO · CV · Jan 23, 2024

SemanticSLAM: Learning based Semantic Map Construction and Robust Camera Localization

arXiv:2401.13076v1 · 4 citations · h-index: 2 · Has Code · SSCI
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and interpretable camera localization in robotics, particularly for indoor navigation tasks. It appears incremental, however, since it builds on existing VSLAM techniques with semantic enhancements.

The paper tackles the problem of high memory and computation overhead in Visual SLAM by introducing SemanticSLAM, an end-to-end visual-inertial odometry system that uses semantic features from an RGB-D sensor to construct a semantic map and localize the camera, improving pose estimation by 17% compared to existing methods.

Current techniques in Visual Simultaneous Localization and Mapping (VSLAM) estimate camera displacement by comparing image features of consecutive scenes. These algorithms depend on scene continuity and hence require frequent camera inputs. However, processing images frequently can lead to significant memory usage and computation overhead. In this study, we introduce SemanticSLAM, an end-to-end visual-inertial odometry system that utilizes semantic features extracted from an RGB-D sensor. This approach enables the creation of a semantic map of the environment and ensures reliable camera localization. SemanticSLAM is scene-agnostic, which means it doesn't require retraining for different environments. It operates effectively in indoor settings, even with infrequent camera input, without prior knowledge. The strength of SemanticSLAM lies in its ability to gradually refine the semantic map and improve pose estimation. This is achieved by a convolutional long short-term memory (ConvLSTM) network, trained to correct errors during map construction. Compared to existing VSLAM algorithms, SemanticSLAM improves pose estimation by 17%. The resulting semantic map provides interpretable information about the environment and can be easily applied to various downstream tasks, such as path planning, obstacle avoidance, and robot navigation. The code will be publicly available at https://github.com/Leomingyangli/SemanticSLAM
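
The abstract names a ConvLSTM network as the component that refines the map over time, but gives no architectural details. As a rough illustration of what a ConvLSTM update looks like in general, here is a minimal NumPy sketch of a single generic ConvLSTM step (standard gate equations, no peephole terms). It is not the authors' network: the function names, tensor shapes, gate ordering, and the idea of treating the hidden state as a map-like grid are all assumptions made for this example.

```python
import numpy as np

def conv2d_same(x, w):
    """Cross-correlate x (C_in, H, W) with w (C_out, C_in, k, k), odd k, zero padding."""
    k = w.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    # windows has shape (C_in, H, W, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(xp, (k, k), axis=(1, 2))
    return np.einsum("chwij,ocij->ohw", windows, w)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, w, b):
    """One generic ConvLSTM update (hypothetical helper, not from the paper).

    x: (C_x, H, W) current observation grid
    h, c: (C_h, H, W) hidden and cell state from the previous step
    w: (4 * C_h, C_x + C_h, k, k) stacked gate kernels; b: (4 * C_h,) biases
    """
    gates = conv2d_same(np.concatenate([x, h], axis=0), w) + b[:, None, None]
    i, f, o, g = np.split(gates, 4, axis=0)  # input, forget, output, candidate
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

# Usage with made-up sizes: 3 observation channels, 4 hidden channels, 8x8 grid.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))
h = np.zeros((4, 8, 8))
c = np.zeros((4, 8, 8))
w = rng.normal(scale=0.1, size=(16, 7, 3, 3))
b = np.zeros(16)
h, c = convlstm_step(x, h, c, w, b)
```

Because the gates are computed by convolution rather than by a dense layer, the state keeps its spatial layout from one step to the next, which is what makes this family of recurrent cells a natural fit for grid-shaped data such as a map.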

Code Implementations: 1 repo
