RO · Apr 17, 2021

Spherical Multi-Modal Place Recognition for Heterogeneous Sensor Systems

arXiv:2104.10067v1 · 18 citations
Originality: Incremental advance
AI Analysis

This paper addresses robust place recognition for robotics and autonomous systems when different sensors are used. The advance is incremental, as it builds on existing multi-modal and spherical projection methods.

The paper tackles place recognition with heterogeneous sensor systems by proposing a multi-modal pipeline that projects images and LiDAR scans onto a sphere and computes descriptors with a spherical CNN. When the sensor setup differs between training and deployment, it achieves up to 10% higher recall than a LiDAR-based system and up to 5% higher than a vision-based system. Its place selection correctly identifies up to 95% of matches from the candidate set.

In this paper, we propose a robust end-to-end multi-modal pipeline for place recognition where the sensor systems can differ from the map building to the query. Our approach operates directly on images and LiDAR scans without requiring any local feature extraction modules. By projecting the sensor data onto the unit sphere, we learn a multi-modal descriptor of partially overlapping scenes using a spherical convolutional neural network. The employed spherical projection model enables the support of arbitrary LiDAR and camera systems readily without losing information. Loop closure candidates are found using a nearest-neighbor lookup in the embedding space. We tackle the problem of correctly identifying the closest place by correlating the candidates' power spectra, obtaining a confidence value per prospect. Our estimate for the correct place corresponds then to the candidate with the highest confidence. We evaluate our proposal w.r.t. state-of-the-art approaches in place recognition using real-world data acquired using different sensors. Our approach can achieve a recall that is up to 10% and 5% higher than for a LiDAR- and vision-based system, respectively, when the sensor setup differs between model training and deployment. Additionally, our place selection can correctly identify up to 95% matches from the candidate set.
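Two steps named in the abstract, projecting sensor data onto the unit sphere and finding loop closure candidates by nearest-neighbor lookup in the embedding space, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `project_to_sphere` and `nearest_neighbors`, the equiangular grid resolution, the choice to keep the closest range per cell, and the Euclidean distance metric are all assumptions made for illustration. The spherical CNN that produces the descriptors and the power-spectrum correlation step are not shown.

```python
import math


def project_to_sphere(points, n_lat=4, n_lon=8):
    """Bin 3D points into an equiangular grid on the unit sphere.

    Assumption (not from the paper): each cell keeps the range of the
    closest point that falls into it; empty cells stay at 0.0.
    """
    grid = [[0.0] * n_lon for _ in range(n_lat)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin has no direction on the sphere
        # Polar angle in [0, pi] and azimuth in [0, 2*pi).
        theta = math.acos(max(-1.0, min(1.0, z / r)))
        phi = math.atan2(y, x) % (2.0 * math.pi)
        # Map the angles to grid indices, clamping the upper boundary.
        i = min(int(theta / math.pi * n_lat), n_lat - 1)
        j = min(int(phi / (2.0 * math.pi) * n_lon), n_lon - 1)
        if grid[i][j] == 0.0 or r < grid[i][j]:
            grid[i][j] = r
    return grid


def nearest_neighbors(query, database, k=3):
    """Return the indices of the k descriptors closest to the query.

    Assumption: plain Euclidean distance with a brute-force scan.
    """
    dists = sorted((math.dist(query, d), idx) for idx, d in enumerate(database))
    return [idx for _, idx in dists[:k]]


if __name__ == "__main__":
    # Hypothetical LiDAR points in the sensor frame.
    scan = [(2.0, 1.0, 1.0), (4.0, 2.0, 2.0), (0.0, 0.0, -3.0)]
    sphere = project_to_sphere(scan)

    # Hypothetical 2D descriptors standing in for learned embeddings.
    map_descriptors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    candidates = nearest_neighbors([0.9, 0.9], map_descriptors, k=2)
    print(sphere, candidates)
```

Because both cameras and LiDARs can be expressed as directions with an associated value, a common spherical grid of this kind is one way to feed heterogeneous sensors into the same network input. A real system would use a much finer grid and an approximate nearest-neighbor index rather than a brute-force scan.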

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
