CV · Sep 3, 2024

When 3D Partial Points Meets SAM: Tooth Point Cloud Segmentation with Sparse Labels

Yifan Liu, Wuyang Li, Cheng Wang, Hui Chen, Yixuan Yuan

arXiv:2409.01691v1 · 7.67 citations · h-index: 18 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the cost of manual annotation in orthodontic applications by enabling effective segmentation with minimal labels. It is incremental, however, because it adapts existing SAM capabilities to a new domain.

The paper tackles tooth point cloud segmentation with extremely sparse labels, proposing SAMTooth, a framework that leverages the Segment Anything Model (SAM) to complement sparse supervision, achieving performance comparable to fully-supervised methods with only 0.1% annotations.

Tooth point cloud segmentation is a fundamental task in many orthodontic applications. Current research mainly focuses on fully supervised learning, which demands expensive and tedious manual point-wise annotation. Although recent weakly supervised alternatives use weak labels for 3D segmentation and achieve promising results, they tend to fail when the labels are extremely sparse. Inspired by the powerful promptable segmentation capability of the Segment Anything Model (SAM), we propose a framework named SAMTooth that leverages this capability to complement the extremely sparse supervision. To automatically generate appropriate point prompts for SAM, we propose a novel Confidence-aware Prompt Generation strategy, in which coarse category predictions are aggregated with confidence-aware filtering. Furthermore, to fully exploit the structural and shape cues in SAM's outputs to assist 3D feature learning, we introduce Mask-guided Representation Learning, which re-projects the tooth masks generated by SAM into 3D space and constrains the points of different teeth to have distinct representations. To demonstrate the effectiveness of the framework, we conduct experiments on a public dataset and find that, with only 0.1% annotations (one point per tooth), our method surpasses recent weakly supervised methods by a large margin and performs comparably to recent fully supervised methods, showcasing the significant potential of applying SAM to 3D perception tasks with sparse labels. Code is available at https://github.com/CUHK-AIM-Group/SAMTooth.
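The Confidence-aware Prompt Generation step described above can be illustrated with a minimal sketch. The abstract does not specify how predictions are aggregated, so the details here are assumptions rather than the authors' method: the function name `generate_point_prompts`, the confidence threshold, the use of class 0 as background, and the confidence-weighted centroid are all hypothetical choices. The sketch takes 3D points already projected to 2D pixel coordinates, discards low-confidence predictions, and produces one point prompt per predicted tooth class.

```python
import numpy as np


def generate_point_prompts(pixel_xy, probs, conf_threshold=0.9, background_class=0):
    """Aggregate coarse per-point predictions into one 2D prompt per class.

    pixel_xy: (N, 2) array of projected pixel coordinates (assumed given).
    probs:    (N, C) array of per-point class probabilities.
    Returns a dict mapping class id -> (2,) prompt coordinate.
    """
    pixel_xy = np.asarray(pixel_xy, dtype=float)
    probs = np.asarray(probs, dtype=float)

    labels = probs.argmax(axis=1)      # coarse category prediction per point
    confidence = probs.max(axis=1)     # confidence of that prediction

    prompts = {}
    for cls in np.unique(labels):
        if cls == background_class:
            continue  # assumption: background (e.g. gingiva) gets no prompt
        # Confidence-aware filtering: keep only confident points of this class.
        keep = (labels == cls) & (confidence >= conf_threshold)
        if not keep.any():
            continue
        # Assumed aggregation: confidence-weighted centroid of the kept points.
        weights = confidence[keep]
        prompts[int(cls)] = (pixel_xy[keep] * weights[:, None]).sum(axis=0) / weights.sum()
    return prompts
```

Each returned coordinate could then be passed to SAM as a foreground point prompt for the corresponding tooth. A centroid can fall outside a non-convex region, so a real pipeline might instead pick the most confident point nearest the centroid.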
