CV · Sep 19, 2023

Cross-modal and Cross-domain Knowledge Transfer for Label-free 3D Segmentation

arXiv:2309.10649v2 · 1 citation · h-index: 33
Originality: Incremental advance
AI Analysis

This work addresses the annotation bottleneck for 3D perception tasks with a label-free solution. The advance is incremental: it builds on existing image datasets rather than introducing new supervision.

The paper tackles the problem of expensive manual annotations for 3D point cloud segmentation by proposing a method to transfer knowledge from 2D images to 3D point clouds without using 3D labels, achieving state-of-the-art performance on SemanticKITTI.

Current state-of-the-art point cloud-based perception methods usually rely on large-scale labeled data, which requires expensive manual annotation. A natural option is to explore unsupervised methods for 3D perception tasks. However, such methods often suffer a substantial performance drop. Fortunately, a large number of image-based datasets already exist, which suggests an alternative: transferring the knowledge in 2D images to 3D point clouds. Specifically, we propose a novel approach for the challenging cross-modal and cross-domain adaptation task by fully exploring the relationship between images and point clouds and designing effective feature alignment strategies. Without any 3D labels, our method achieves state-of-the-art performance for 3D point cloud semantic segmentation on SemanticKITTI by using the knowledge of KITTI360 and GTA5, compared to existing unsupervised and weakly-supervised baselines.
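The abstract does not spell out how images and point clouds are related, so the following is only a minimal sketch of the generic geometric link that 2D-to-3D transfer commonly relies on: projecting LiDAR points into a calibrated camera image and reading off a per-pixel label. It is not the paper's method, and it does not cover the feature alignment strategies the authors describe. The function names, the `ignore_index` value of 255, and the pinhole-camera assumption (intrinsics `K`, extrinsics `T_cam_from_lidar`) are all hypothetical choices made for illustration.

```python
import numpy as np


def project_points(points_lidar, T_cam_from_lidar, K, image_hw):
    """Project N x 3 LiDAR points into pixel coordinates.

    Assumes a pinhole camera: K is the 3 x 3 intrinsic matrix and
    T_cam_from_lidar is a 4 x 4 rigid transform (both hypothetical inputs).
    Returns integer column indices, row indices, and a validity mask.
    """
    n = points_lidar.shape[0]
    homo = np.hstack([points_lidar, np.ones((n, 1))])   # N x 4 homogeneous
    cam = (T_cam_from_lidar @ homo.T).T[:, :3]          # points in camera frame
    in_front = cam[:, 2] > 1e-6                         # drop points behind camera
    depth = np.where(in_front, cam[:, 2], 1.0)          # avoid division by zero
    pix = (K @ cam.T).T
    cols = np.floor(pix[:, 0] / depth).astype(int)
    rows = np.floor(pix[:, 1] / depth).astype(int)
    h, w = image_hw
    valid = in_front & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    return cols, rows, valid


def transfer_labels(points_lidar, label_map, T_cam_from_lidar, K, ignore_index=255):
    """Give each 3D point the 2D label of the pixel it projects onto.

    Points that fall outside the image or behind the camera get ignore_index.
    """
    cols, rows, valid = project_points(
        points_lidar, T_cam_from_lidar, K, label_map.shape
    )
    labels = np.full(points_lidar.shape[0], ignore_index, dtype=label_map.dtype)
    labels[valid] = label_map[rows[valid], cols[valid]]
    return labels


# Toy example: a 100 x 100 label map, class 1 on the left half, class 2 on the right.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)  # assumption for the toy case: LiDAR and camera frames coincide
label_map = np.ones((100, 100), dtype=np.uint8)
label_map[:, 50:] = 2
points = np.array([[-0.25, 0.0, 1.0],   # projects into the left half
                   [0.25, 0.0, 1.0],    # projects into the right half
                   [0.0, 0.0, -1.0],    # behind the camera
                   [10.0, 0.0, 1.0]])   # outside the image bounds
point_labels = transfer_labels(points, label_map, T, K)
```

In a label-free setting like the one the abstract describes, the 2D label map would itself come from a model trained on other image datasets rather than from ground truth, so labels transferred this way would be noisy pseudo-labels at best.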
