3D-PointZshotS: Geometry-Aware 3D Point Cloud Zero-Shot Semantic Segmentation Narrowing the Visual-Semantic Gap
This work addresses zero-shot semantic segmentation for 3D point clouds, a task that matters for applications such as robotics and autonomous driving. The contribution appears incremental, however, because it builds on existing methods and adds specific enhancements.
The paper tackles limited transferability in zero-shot 3D point cloud segmentation by introducing 3D-PointZshotS, a geometry-aware framework that uses latent geometric prototypes and a self-consistency loss to improve feature generation and alignment. It reports higher harmonic mIoU than the compared baselines on three real-world datasets.
Existing zero-shot 3D point cloud segmentation methods often struggle with limited transferability from seen classes to unseen classes and from semantic to visual space. To alleviate this, we introduce 3D-PointZshotS, a geometry-aware zero-shot segmentation framework that enhances both feature generation and alignment using latent geometric prototypes (LGPs). Specifically, we integrate LGPs into a generator via a cross-attention mechanism, enriching semantic features with fine-grained geometric details. To further enhance stability and generalization, we introduce a self-consistency loss, which enforces feature robustness against point-wise perturbations. Additionally, we re-represent visual and semantic features in a shared space, bridging the semantic-visual gap and facilitating knowledge transfer to unseen classes. Experiments on three real-world datasets, namely ScanNet, SemanticKITTI, and S3DIS, demonstrate that our method achieves superior performance over four baselines in terms of harmonic mIoU. The code is available at \href{https://github.com/LexieYang/3D-PointZshotS}{GitHub}.
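
The abstract reports results as harmonic mIoU without defining it. The sketch below assumes the definition common in generalized zero-shot segmentation, the harmonic mean of the mIoU over seen classes and the mIoU over unseen classes. The function name `harmonic_miou` and the example scores are hypothetical and are not taken from the paper or its repository.

```python
def harmonic_miou(miou_seen: float, miou_unseen: float) -> float:
    """Harmonic mean of seen-class and unseen-class mIoU.

    Assumed definition: HmIoU = 2 * S * U / (S + U), where S and U are the
    mean IoU over seen and unseen classes respectively.
    """
    total = miou_seen + miou_unseen
    if total == 0:
        # Both scores are zero; define the harmonic mean as zero
        # to avoid dividing by zero.
        return 0.0
    return 2.0 * miou_seen * miou_unseen / total


# Hypothetical scores, chosen only to illustrate the metric's behaviour.
balanced = harmonic_miou(0.40, 0.40)  # equal seen and unseen performance
skewed = harmonic_miou(0.70, 0.10)    # same arithmetic mean, weak unseen classes
print(f"balanced={balanced:.3f}, skewed={skewed:.3f}")
```

Both hypothetical pairs have the same arithmetic mean, but the skewed pair scores much lower. This is why the harmonic form is used for zero-shot evaluation: a model cannot compensate for poor unseen-class segmentation with strong seen-class performance.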