CV, Jul 19, 2024

Scale Disparity of Instances in Interactive Point Cloud Segmentation

arXiv:2407.14009v1, 3 citations, h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses scale disparity in interactive segmentation for 3D scene understanding, giving users a more versatile tool. The advance is incremental because it builds on existing interactive methods.

The paper tackles interactive point cloud segmentation for instances of both thing and stuff categories at varying scales. It proposes ClickFormer, which outperforms existing methods in accuracy while requiring fewer clicks on both indoor and outdoor datasets.

Interactive point cloud segmentation has become a pivotal task for understanding 3D scenes, enabling users to guide segmentation models with simple interactions such as clicks, thereby significantly reducing the effort required to tailor models to diverse scenarios and new categories. However, in interactive segmentation, the meaning of "instance" differs from that in instance segmentation, because users might want to segment instances of both thing and stuff categories that vary greatly in scale. Existing methods have focused on thing categories, neglecting the segmentation of stuff categories and the difficulties arising from scale disparity. To bridge this gap, we propose ClickFormer, an innovative interactive point cloud segmentation model that accurately segments instances of both thing and stuff categories. We propose a query augmentation module that augments click queries using a global query sampling strategy, thus maintaining consistent performance across different instance scales. Additionally, we employ global attention in the query-voxel transformer to mitigate the risk of generating false positives, along with several other network structure improvements to further enhance the model's segmentation performance. Experiments demonstrate that ClickFormer outperforms existing interactive point cloud segmentation methods across both indoor and outdoor datasets, providing more accurate segmentation results with fewer user clicks in an open-world setting.
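
The abstract names two mechanisms, query augmentation via global query sampling and global attention between queries and voxels, without giving their details. The sketch below is one plausible reading, not the paper's implementation. It assumes that global queries are drawn by farthest point sampling over voxel positions (the abstract does not name the sampling strategy) and that "global attention" means each query attends to every voxel with plain scaled dot-product attention. All function names and the toy data are hypothetical.

```python
import math
import random


def farthest_point_sample(points, k, seed=0):
    """Pick up to k indices spread across the whole scene.

    Assumption: farthest point sampling stands in for the paper's
    unspecified "global query sampling strategy".
    """
    rng = random.Random(seed)
    chosen = [rng.randrange(len(points))]
    # dist[i] = distance from point i to its nearest already-chosen point
    dist = [math.dist(p, points[chosen[0]]) for p in points]
    while len(chosen) < min(k, len(points)):
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen


def augment_queries(click_idx, voxel_feats, voxel_xyz, num_global):
    """Return click queries followed by globally sampled queries."""
    sampled = farthest_point_sample(voxel_xyz, num_global)
    global_idx = [i for i in sampled if i not in click_idx]
    idx = list(click_idx) + global_idx
    return idx, [voxel_feats[i] for i in idx]


def global_attention(queries, voxel_feats):
    """Let each query attend to every voxel (softmax over the full scene)."""
    dim = len(voxel_feats[0])
    scale = math.sqrt(dim)
    updated = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, v)) / scale for v in voxel_feats]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        updated.append([
            sum(w * v[d] for w, v in zip(weights, voxel_feats)) / total
            for d in range(dim)
        ])
    return updated


# Toy scene: five voxels with 3D positions and 2D features (made-up values).
voxel_xyz = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (5, 5, 0), (9, 9, 0)]
voxel_feats = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.0, 1.0]]

# One user click on voxel 0, augmented with up to two global queries.
idx, queries = augment_queries([0], voxel_feats, voxel_xyz, num_global=2)
updated = global_attention(queries, voxel_feats)
```

In this reading, the extra queries give the model reference points outside the clicked region, which is one way sampling could help when a single click lands on a very large or very small instance. A real model would use learned projections and multi-head attention rather than raw features.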

