CV · Jun 24, 2022

Contrastive Learning of Features between Images and LiDAR

arXiv:2206.12071v1 · 8 citations · h-index: 35
Originality Incremental advance
AI Analysis

The paper addresses the challenge of sensor fusion for robots, but the advance is incremental because it builds on existing contrastive learning methods and network architectures.

The paper tackles the problem of learning cross-modal features between images and LiDAR point clouds for robotics tasks such as localization and mapping. It proposes a Tuple-Circle loss function and modified network architectures. It demonstrates their effectiveness on a real-world dataset, with visualizations of the features learned from both modalities.

Images and point clouds provide different information to robots. Finding correspondences between data from different sensors is crucial for tasks such as localization, mapping, and navigation. Learning-based descriptors have been developed for single sensors, but there is little work on cross-modal features. This work treats learning cross-modal features as a dense contrastive learning problem. We propose a Tuple-Circle loss function for cross-modality feature learning. Furthermore, to learn good features without losing generality, we develop variants of the widely used PointNet++ architecture for point clouds and the U-Net CNN architecture for images. We conduct experiments on a real-world dataset to show the effectiveness of our loss function and network structure. By visualizing the features, we show that our models indeed learn information from both images and LiDAR.
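The abstract does not define the Tuple-Circle loss, so the following is only a rough sketch of what a dense cross-modal contrastive objective could look like. As a stand-in it uses the standard Circle loss formulation, not the paper's Tuple-Circle variant. It assumes pixel descriptor i corresponds to point descriptor i, and that every other point serves as a negative. The function names (`circle_loss`, `cross_modal_loss`) and the `margin` and `gamma` values are hypothetical choices for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))); -inf for an empty list."""
    if not values:
        return float("-inf")
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def circle_loss(pos_sims, neg_sims, margin=0.25, gamma=32.0):
    """Standard Circle loss for one anchor (assumed stand-in, see lead-in)."""
    opt_pos, opt_neg = 1.0 + margin, -margin      # similarity optima
    delta_pos, delta_neg = 1.0 - margin, margin   # decision margins
    # Each pair is weighted by how far its similarity is from its optimum.
    pos_logits = [-gamma * max(opt_pos - s, 0.0) * (s - delta_pos)
                  for s in pos_sims]
    neg_logits = [gamma * max(s - opt_neg, 0.0) * (s - delta_neg)
                  for s in neg_sims]
    return softplus(logsumexp(pos_logits) + logsumexp(neg_logits))


def cross_modal_loss(pixel_feats, point_feats):
    """Mean per-anchor loss, assuming pixel i matches point i."""
    total = 0.0
    for i, anchor in enumerate(pixel_feats):
        pos = [cosine(anchor, point_feats[i])]
        neg = [cosine(anchor, p) for j, p in enumerate(point_feats) if j != i]
        total += circle_loss(pos, neg)
    return total / len(pixel_feats)


# Toy descriptors: aligned pairs should score a lower loss than shuffled ones.
pixels = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_points = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
shuffled_points = [[0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]]
aligned_loss = cross_modal_loss(pixels, aligned_points)
shuffled_loss = cross_modal_loss(pixels, shuffled_points)
```

In a real pipeline the descriptors would come from the image and point cloud networks, and the pixel-to-point correspondences from projecting LiDAR points into the camera frame. Both are outside what this sketch covers.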
