CV | AI | Jun 25, 2024

Point-SAM: Promptable 3D Segmentation Model for Point Clouds

arXiv:2406.17741v2 | 60 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses promptable 3D segmentation of point clouds for applications such as interactive annotation and zero-shot instance proposals. It is an incremental advance that adapts the 2D Segment Anything approach to 3D.

The paper tackles the challenge of building a 3D foundation model for segmentation. It proposes Point-SAM, which extends the 2D Segment Anything Model (SAM) to point clouds with a transformer-based architecture and distills knowledge from 2D SAM through pseudo-labels. The authors report that it outperforms state-of-the-art 3D segmentation models on several indoor and outdoor benchmarks.

The development of 2D foundation models for image segmentation has been significantly advanced by the Segment Anything Model (SAM). However, achieving similar success in 3D models remains a challenge due to issues such as non-unified data formats, poor model scalability, and the scarcity of labeled data with diverse masks. To this end, we propose a 3D promptable segmentation model Point-SAM, focusing on point clouds. We employ an efficient transformer-based architecture tailored for point clouds, extending SAM to the 3D domain. We then distill the rich knowledge from 2D SAM for Point-SAM training by introducing a data engine to generate part-level and object-level pseudo-labels at scale from 2D SAM. Our model outperforms state-of-the-art 3D segmentation models on several indoor and outdoor benchmarks and demonstrates a variety of applications, such as interactive 3D annotation and zero-shot 3D instance proposal. Code and a demo are available at https://github.com/zyc00/Point-SAM.
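To make the idea of "promptable" segmentation on point clouds concrete, the sketch below shows only the input/output contract: a set of 3D points plus a few labeled prompt points go in, and a per-point binary mask comes out. It is a toy stand-in and not the Point-SAM network or its API. The function names `segment_from_prompts` and `iou` are hypothetical, and the nearest-prompt rule is an assumption chosen purely for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def segment_from_prompts(
    points: Sequence[Point],
    prompts: Sequence[Point],
    prompt_labels: Sequence[int],
) -> List[bool]:
    """Toy promptable segmenter (hypothetical, NOT the Point-SAM model).

    A point joins the mask when its nearest prompt is positive (label 1)
    rather than negative (label 0).
    """
    if not prompts or len(prompts) != len(prompt_labels):
        raise ValueError("need one label per prompt and at least one prompt")
    mask = []
    for p in points:
        # Index of the prompt closest to this point (Euclidean distance).
        nearest = min(range(len(prompts)), key=lambda i: math.dist(p, prompts[i]))
        mask.append(prompt_labels[nearest] == 1)
    return mask


def iou(pred: Sequence[bool], target: Sequence[bool]) -> float:
    """Intersection over union of two per-point boolean masks."""
    inter = sum(1 for a, b in zip(pred, target) if a and b)
    union = sum(1 for a, b in zip(pred, target) if a or b)
    return inter / union if union else 1.0


# Two well-separated clusters; the first three points are the target object.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
target = [True, True, True, False, False]

# One positive click selects everything; a negative click refines the mask.
first = segment_from_prompts(points, [(0.0, 0.0, 0.0)], [1])
refined = segment_from_prompts(
    points, [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)], [1, 0]
)
print(iou(first, target), iou(refined, target))
```

The two calls mimic an interactive annotation loop: the mask is recomputed each time the user adds a positive or negative prompt, and IoU against a reference mask measures how much each prompt helped.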

Code Implementations: 1 repo
