CV · Aug 28, 2023

Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation

arXiv:2308.14619v2 · 32 citations · h-index: 55
Originality: Highly original
AI Analysis

This paper addresses the limited generalization of deep-learning models for point cloud segmentation across different sensors and environments, offering a novel domain adaptation approach for 3D point clouds.

The paper tackles the problem of domain shift in 3D point cloud semantic segmentation by introducing compositional semantic mixing, an unsupervised domain adaptation technique that uses semantic and geometric sample mixing. The method significantly outperforms state-of-the-art methods in synthetic-to-real and real-to-real scenarios, as demonstrated on LiDAR datasets.

Deep-learning models for 3D point cloud semantic segmentation exhibit limited generalization capabilities when trained and tested on data captured with different sensors or in varying environments due to domain shift. Domain adaptation methods can be employed to mitigate this domain shift, for instance, by simulating sensor noise, developing domain-agnostic generators, or training point cloud completion networks. Often, these methods are tailored for range view maps or necessitate multi-modal input. In contrast, domain adaptation in the image domain can be executed through sample mixing, which emphasizes input data manipulation rather than employing distinct adaptation modules. In this study, we introduce compositional semantic mixing for point cloud domain adaptation, representing the first unsupervised domain adaptation technique for point cloud segmentation based on semantic and geometric sample mixing. We present a two-branch symmetric network architecture capable of concurrently processing point clouds from a source domain (e.g. synthetic) and point clouds from a target domain (e.g. real-world). Each branch operates within one domain by integrating selected data fragments from the other domain and utilizing semantic information derived from source labels and target (pseudo) labels. Additionally, our method can leverage a limited number of human point-level annotations (semi-supervised) to further enhance performance. We assess our approach in both synthetic-to-real and real-to-real scenarios using LiDAR datasets and demonstrate that it significantly outperforms state-of-the-art methods in both unsupervised and semi-supervised settings.
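The mixing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`select_by_class`, `mix`), the choice of classes, and the random toy data are all assumptions made for illustration, and the geometric augmentation of the pasted fragments and any pseudo-label filtering are omitted. The sketch only shows the core idea that each branch keeps a full cloud from one domain and pastes in the points of selected semantic classes from the other, using source labels on one side and target pseudo-labels on the other.

```python
import numpy as np


def select_by_class(points, labels, classes):
    """Return the points and labels whose label is in `classes`."""
    mask = np.isin(labels, classes)
    return points[mask], labels[mask]


def mix(host_points, host_labels, guest_points, guest_labels, guest_classes):
    """Paste guest-domain points of the chosen classes into the host cloud.

    Hypothetical helper: the host cloud is kept whole and the selected
    guest fragments are concatenated to it, together with their labels.
    """
    sel_pts, sel_lbl = select_by_class(guest_points, guest_labels, guest_classes)
    mixed_points = np.concatenate([host_points, sel_pts], axis=0)
    mixed_labels = np.concatenate([host_labels, sel_lbl], axis=0)
    return mixed_points, mixed_labels


# Toy data (assumed): xyz coordinates and 4 semantic classes.
rng = np.random.default_rng(0)
src_pts = rng.normal(size=(100, 3))
src_lbl = rng.integers(0, 4, size=100)      # ground-truth source labels
tgt_pts = rng.normal(size=(80, 3))
tgt_pseudo = rng.integers(0, 4, size=80)    # stand-in for target pseudo-labels

# Branch operating in the source domain: source cloud + target fragments.
src_mixed_pts, src_mixed_lbl = mix(src_pts, src_lbl, tgt_pts, tgt_pseudo, [1, 2])

# Branch operating in the target domain: target cloud + source fragments.
tgt_mixed_pts, tgt_mixed_lbl = mix(tgt_pts, tgt_pseudo, src_pts, src_lbl, [0, 3])

print(src_mixed_pts.shape, tgt_mixed_pts.shape)
```

In a training loop, each mixed cloud and its mixed labels would be fed to the corresponding branch of the segmentation network; how the classes are sampled and how the pasted fragments are transformed are design choices this sketch leaves open.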

Code Implementations: 1 repo
