CV | AI | RO | Mar 27, 2024

ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place Recognition

arXiv:2403.18762v1 | 9 citations | h-index: 11 | Has Code | IROS
Originality: Highly original
AI Analysis

The paper addresses efficient localization for robots and autonomous vehicles without relying on expensive labeled data for depth supervision.

The paper tackles cross-modal place recognition between images and point clouds by introducing a lightweight framework that eliminates computationally intensive depth estimation, achieving state-of-the-art performance on the KITTI dataset with real-time operation and demonstrating generalization on a 17 km trajectory.

Place recognition is an important task for robots and autonomous cars to localize themselves and close loops in pre-built maps. While single-modal sensor-based methods have shown satisfactory performance, cross-modal place recognition that retrieves images from a point-cloud database remains a challenging problem. Current cross-modal methods transform images into 3D points using depth estimation for modality conversion, which is usually computationally intensive and needs expensive labeled data for depth supervision. In this work, we introduce a fast and lightweight framework to encode images and point clouds into place-distinctive descriptors. We propose an effective Field of View (FoV) transformation module to convert point clouds into a modality analogous to images. This module eliminates the need for depth estimation and helps subsequent modules achieve real-time performance. We further design a non-negative factorization-based encoder to extract mutually consistent semantic features between point clouds and images. This encoder yields more distinctive global descriptors for retrieval. Experimental results on the KITTI dataset show that our proposed methods achieve state-of-the-art performance while running in real time. Additional evaluation on the HAOMO dataset covering a 17 km trajectory further shows the practical generalization capabilities. We have released the implementation of our methods as open source at: https://github.com/haomo-ai/ModaLink.git.
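
To make the idea of converting a point cloud into an image-like modality concrete, here is a minimal sketch. It is not the authors' FoV transformation module. It assumes a pinhole camera with known intrinsics `K` and a known LiDAR-to-camera transform `T_cam_lidar`, and the function name `project_to_fov_depth_image` is hypothetical. It projects the points that fall inside the camera's field of view into a sparse depth image, keeping the nearest point per pixel.

```python
import numpy as np


def project_to_fov_depth_image(points_lidar, T_cam_lidar, K, height, width):
    """Render LiDAR points as a sparse depth image in the camera's field of view.

    Illustrative assumption only (pinhole model, known calibration).

    points_lidar: (N, 3) xyz coordinates in the LiDAR frame.
    T_cam_lidar:  (4, 4) rigid transform from the LiDAR frame to the camera frame.
    K:            (3, 3) pinhole intrinsic matrix.
    Returns a (height, width) float32 array; 0 marks pixels hit by no point.
    """
    points_lidar = np.asarray(points_lidar, dtype=np.float64).reshape(-1, 3)

    # Move the points into the camera frame using homogeneous coordinates.
    ones = np.ones((points_lidar.shape[0], 1))
    pts_cam = (T_cam_lidar @ np.hstack([points_lidar, ones]).T).T[:, :3]

    # Discard points behind (or on) the image plane.
    pts_cam = pts_cam[pts_cam[:, 2] > 1e-6]

    # Pinhole projection to pixel coordinates.
    uvw = (K @ pts_cam.T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(np.int64)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(np.int64)
    depth = pts_cam[:, 2].astype(np.float32)

    # Keep only the points that land inside the image, i.e. inside the FoV.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, depth = u[inside], v[inside], depth[inside]

    # Per-pixel minimum acts as a simple z-buffer: the nearest point wins.
    image = np.full((height, width), np.inf, dtype=np.float32)
    np.minimum.at(image, (v, u), depth)
    image[np.isinf(image)] = 0.0
    return image
```

The resulting depth image shares the camera's viewpoint and pixel grid, so an image encoder and a point-cloud encoder can operate on comparable 2D inputs without estimating depth from the RGB image.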

Code Implementations: 1 repo
