CV · Mar 14, 2023

GeoSpark: Sparking up Point Cloud Segmentation with Geometry Clue

arXiv:2303.08274v11 citationsh-index: 57
Originality: Incremental advance
AI Analysis

GeoSpark addresses segmentation accuracy issues in 3D vision applications, offering an incremental improvement through a plug-in module.

The paper targets two weaknesses of point cloud segmentation networks: limited long-range feature modeling and data-agnostic downsampling. It proposes GeoSpark, a plug-in module that uses geometry clues to guide both feature aggregation and downsampling. Integrated with Point Transformer, it reaches 74.7% mIoU on ScanNetv2 (a 4.1-point improvement) and 71.5% mIoU on S3DIS Area 5 (a 1.1-point improvement).
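For readers unfamiliar with the metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from per-point labels. The function name `mean_iou` and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions made for illustration; the benchmarks' official evaluation scripts may handle ignored labels and absent classes differently.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over classes, from integer per-point labels.

    Illustrative helper (hypothetical name); classes absent from both
    prediction and ground truth are skipped, which is an assumption.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    # confusion[i, j] counts points with true class i predicted as class j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    true_pos = np.diag(confusion)
    # Union = predicted as class c + truly class c - counted twice.
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - true_pos
    valid = union > 0
    return float((true_pos[valid] / union[valid]).mean())


# Four points, two classes: per-class IoU is 1/2 and 2/3.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
print(f"mIoU = {score:.4f}")
```

Because every class contributes equally to the average, a class with few points (a small object) weighs as much as a class with many, which is why losing small-object points during downsampling can hurt this metric.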

Current point cloud segmentation architectures suffer from limited long-range feature modeling, as they mostly rely on aggregating information with local neighborhoods. Furthermore, in order to learn point features at multiple scales, most methods utilize a data-agnostic sampling approach to decrease the number of points after each stage. Such sampling methods, however, often discard points for small objects in the early stages, leading to inadequate feature learning. We believe these issues can be mitigated by introducing explicit geometry clues as guidance. To this end, we propose GeoSpark, a Plug-in module that incorporates Geometry clues into the network to Spark up feature learning and downsampling. GeoSpark can be easily integrated into various backbones. For feature aggregation, it improves feature modeling by allowing the network to learn from both local points and neighboring geometry partitions, resulting in an enlarged data-tailored receptive field. Additionally, GeoSpark utilizes geometry partition information to guide the downsampling process, where points with unique features are preserved while redundant points are fused, resulting in better preservation of key points throughout the network. We observed consistent improvements after adding GeoSpark to various backbones including PointNet++, KPConv, and Point Transformer. Notably, when integrated with Point Transformer, our GeoSpark module achieves a 74.7% mIoU on the ScanNetv2 dataset (4.1% improvement) and 71.5% mIoU on the S3DIS Area 5 dataset (1.1% improvement), ranking top on both benchmarks. Code and models will be made publicly available.
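The contrast between data-agnostic and geometry-guided downsampling can be illustrated with a small sketch. This is not the GeoSpark algorithm: the abstract does not specify how partitions are built or how redundant points are fused, so the code below assumes partition ids are already given and uses a simple stand-in heuristic (keep one point per partition, then fill the remaining budget at random). The function names `random_downsample` and `partition_aware_downsample` are hypothetical.

```python
import numpy as np


def random_downsample(num_points, keep, rng):
    """Data-agnostic baseline: keep a uniform random subset of indices."""
    return np.sort(rng.choice(num_points, size=keep, replace=False))


def partition_aware_downsample(partition_ids, keep, rng):
    """Keep at least one point from every partition, then fill at random.

    Stand-in heuristic for illustration only; it does not fuse redundant
    points and is not the method described in the paper.
    """
    partition_ids = np.asarray(partition_ids)
    num_points = partition_ids.shape[0]
    unique_ids = np.unique(partition_ids)
    if keep < unique_ids.size:
        raise ValueError("keep must be at least the number of partitions")
    # One random representative per partition, so small partitions survive.
    reps = np.array(
        [rng.choice(np.flatnonzero(partition_ids == pid)) for pid in unique_ids]
    )
    # Spend the remaining budget uniformly on the points not yet chosen.
    remaining = np.setdiff1d(np.arange(num_points), reps)
    extra = rng.choice(remaining, size=keep - reps.size, replace=False)
    return np.sort(np.concatenate([reps, extra]))


rng = np.random.default_rng(0)
# 1000 points on a large surface (partition 0), 5 on a small object (partition 1).
partition_ids = np.array([0] * 1000 + [1] * 5)

baseline = random_downsample(partition_ids.size, keep=50, rng=rng)
guided = partition_aware_downsample(partition_ids, keep=50, rng=rng)
print("small-object points kept (random):", int((partition_ids[baseline] == 1).sum()))
print("small-object points kept (guided):", int((partition_ids[guided] == 1).sum()))
```

With uniform sampling, the small partition can vanish entirely after one stage, whereas the guided variant keeps at least one of its points by construction. That guarantee is the only property this sketch is meant to show.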

