CV · Jan 18, 2023

Contrastive Learning for Self-Supervised Pre-Training of Point Cloud Segmentation Networks With Image Data

arXiv:2301.07283v3 · 4 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses the shortage of labeled data for 3D semantic segmentation, particularly in settings where localization information is unavailable. The contribution is incremental because it builds on existing contrastive learning approaches.

The paper aims to reduce annotation costs for 3D point cloud segmentation by proposing a self-supervised pre-training method that uses image data to train a 3D model. Using only single scans, the method achieves performance comparable to multi-scan, point cloud-only methods.

Reducing the quantity of annotations required for supervised training is vital when labels are scarce and costly. This reduction is particularly important for semantic segmentation tasks involving 3D datasets, which are often significantly smaller and more challenging to annotate than their image-based counterparts. Self-supervised pre-training on unlabelled data is one way to reduce the amount of manual annotations needed. Previous work has focused on pre-training with point clouds exclusively. While useful, this approach often requires two or more registered views. In the present work, we combine image and point cloud modalities by first learning self-supervised image features and then using these features to train a 3D model. By incorporating image data, which is often included in many 3D datasets, our pre-training method only requires a single scan of a scene and can be applied to cases where localization information is unavailable. We demonstrate that our pre-training approach, despite using single scans, achieves comparable performance to other multi-scan, point cloud-only methods.
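The abstract does not spell out the training objective, so the following is only a minimal sketch of one common way to transfer image features to a 3D model: an InfoNCE-style contrastive loss between per-point features and the features of the pixels those points project to. The function names, the pairing convention (row i of each list is a matching point/pixel pair), and the temperature value are illustrative assumptions, not details taken from the paper. It uses plain Python for portability; a real implementation would use a tensor library.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce(point_feats, pixel_feats, temperature=0.07):
    """Mean InfoNCE loss over matched point/pixel feature pairs.

    Assumption (hypothetical setup): point_feats[i] comes from the 3D model and
    pixel_feats[i] from a frozen, self-supervised image model at the pixel that
    point i projects to. All other pixels in the batch act as negatives.
    """
    if not point_feats or len(point_feats) != len(pixel_feats):
        raise ValueError("need equally sized, non-empty feature lists")

    points = [l2_normalize(v) for v in point_feats]
    pixels = [l2_normalize(v) for v in pixel_feats]

    total = 0.0
    for i, p in enumerate(points):
        # Cosine similarity of point i to every pixel, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(p, q)) / temperature for q in pixels]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching pixel (index i) as the target.
        total += log_denominator - logits[i]
    return total / len(points)


if __name__ == "__main__":
    feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    mismatched = [feats[1], feats[2], feats[0]]
    print("aligned loss:   ", info_nce(feats, feats))
    print("mismatched loss:", info_nce(feats, mismatched))
```

Minimizing such a loss pulls each point feature toward its corresponding pixel feature and away from the others, which only needs a single scan and its image, not a second registered view.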
