CV | Aug 21, 2024

GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation

arXiv:2408.11558v1 | 1 citation | h-index: 13 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses point cloud segmentation for computer vision applications, presenting an incremental improvement with a novel hybrid method.

The paper tackles the challenge of learning meaningful local and global information in point cloud segmentation. It proposes GSTran, a transformer network designed to counter two problems: indiscriminate neighbor aggregation and inaccurate modeling of long-distance dependencies. The authors report better results than competing methods on the ShapeNetPart and S3DIS benchmarks.

Learning meaningful local and global information remains a challenge in point cloud segmentation tasks. When utilizing local information, prior studies indiscriminately aggregate neighbor information from different classes to update query points, potentially compromising the distinctive features of those query points. In parallel, inaccurate modeling of long-distance contextual dependencies when utilizing global information can also degrade model performance. To address these issues, we propose GSTran, a novel transformer network tailored for the segmentation task. The proposed network consists of two principal components: a local geometric transformer and a global semantic transformer. In the local geometric transformer module, we explicitly calculate the geometric disparity within the local region. This amplifies the affinity with geometrically similar neighbor points while suppressing the association with other neighbors. In the global semantic transformer module, we design a multi-head voting strategy. This strategy evaluates semantic similarity across the entire spatial range, facilitating the precise capture of contextual dependencies. Experiments on the ShapeNetPart and S3DIS benchmarks demonstrate the effectiveness of the proposed method, showing its superiority over other algorithms. The code is available at https://github.com/LAB123-tech/GSTran.
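
To make the local idea concrete, the following is a minimal sketch of geometry-aware neighbor aggregation for a single query point. It is not the authors' implementation: the abstract does not give the formula, so the disparity measure (plain Euclidean distance), the subtraction of a penalty from the attention logit, and all names (`knn_indices`, `geometry_aware_aggregate`, `beta`) are assumptions chosen for illustration. It only shows how a geometric term can raise the weight of similar neighbors and lower the weight of the rest.

```python
import math


def knn_indices(points, query_idx, k):
    """Indices of the k points closest to points[query_idx], excluding itself."""
    q = points[query_idx]
    dists = [(math.dist(q, p), j) for j, p in enumerate(points) if j != query_idx]
    dists.sort()
    return [j for _, j in dists[:k]]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def geometry_aware_aggregate(points, feats, query_idx, k=3, beta=1.0):
    """Update one query feature from its k nearest neighbors.

    Assumed form (illustrative only):
        logit_j = <f_q, f_j> / sqrt(d) - beta * disparity(q, j)
    where disparity is the Euclidean distance between the two points.
    With beta = 0 this reduces to ordinary dot-product attention.
    """
    nbrs = knn_indices(points, query_idx, k)
    q_feat = feats[query_idx]
    scale = math.sqrt(len(q_feat))

    logits = []
    for j in nbrs:
        similarity = sum(a * b for a, b in zip(q_feat, feats[j])) / scale
        disparity = math.dist(points[query_idx], points[j])
        logits.append(similarity - beta * disparity)

    weights = softmax(logits)
    # Weighted sum of neighbor features, one output value per feature channel.
    updated = [
        sum(w * feats[j][d] for w, j in zip(weights, nbrs))
        for d in range(len(q_feat))
    ]
    return updated, dict(zip(nbrs, weights))


if __name__ == "__main__":
    # Toy cloud: one query at the origin, one close neighbor, one far neighbor.
    pts = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (2.0, 0.0, 0.0)]
    fts = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    new_feat, w = geometry_aware_aggregate(pts, fts, query_idx=0, k=2, beta=1.0)
    print(new_feat, w)
```

In this toy setup all three points carry the same feature vector, so only the geometric term separates the neighbors: the closer neighbor receives the larger weight when `beta` is positive, and the two weights are equal when `beta` is zero. The global multi-head voting strategy is not sketched here because the abstract does not describe it in enough detail.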

Code Implementations: 1 repo
