CV · Apr 27, 2023

A Review of Panoptic Segmentation for Mobile Mapping Point Clouds

arXiv:2304.13980v2 · 20 citations · h-index: 78
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of comprehensive 3D scene understanding for researchers and practitioners in street mapping, but it is incremental as it builds on existing methods and focuses on a specific domain.

The paper tackles the lack of work on panoptic segmentation for outdoor mobile-mapping point clouds by reviewing methods, setting up a modular pipeline for systematic experiments, and providing the first public dataset for this task. It finds that KPConv performs best but is slower, PointNet++ is fastest but performs significantly worse, and sparse CNNs are intermediate, with clustering embedding features outperforming shifted coordinates for instance segmentation.

3D point cloud panoptic segmentation is the combined task of (i) assigning each point to a semantic class and (ii) separating the points in each class into object instances. Recently there has been increased interest in such comprehensive 3D scene understanding, building on the rapid advances of semantic segmentation due to the advent of deep 3D neural networks. Yet, to date there is very little work on panoptic segmentation of outdoor mobile-mapping data, and there are no systematic comparisons. The present paper tries to close that gap. It reviews the building blocks needed to assemble a panoptic segmentation pipeline and the related literature. Moreover, a modular pipeline is set up to perform comprehensive, systematic experiments to assess the state of panoptic segmentation in the context of street mapping. As a byproduct, we also provide the first public dataset for that task, by extending the NPM3D dataset to include instance labels. That dataset and our source code are publicly available. We discuss which adaptations are needed to apply current panoptic segmentation methods to outdoor scenes and large objects. Our study finds that for mobile mapping data, KPConv performs best but is slower, while PointNet++ is fastest but performs significantly worse. Sparse CNNs are in between. Regardless of the backbone, instance segmentation by clustering embedding features is better than using shifted coordinates.
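To make the two-step task definition concrete, the following is a minimal sketch of how per-point semantic labels and a per-class clustering step can be combined into panoptic output. It is not the paper's pipeline: the function names (`cluster_by_radius`, `panoptic_labels`), the `radius` parameter, and the toy class ids are hypothetical, and a simple radius-based connected-components clustering stands in for whichever clustering method a real pipeline would use. The `features` array may hold learned embedding features or coordinates shifted toward predicted instance centres; the toy example simply uses raw coordinates.

```python
import numpy as np


def cluster_by_radius(features, radius):
    """Connected-components clustering: points closer than `radius`
    (directly or through a chain of neighbours) get the same id.
    A stand-in for the clustering step, not the paper's method."""
    n = len(features)
    labels = np.full(n, -1, dtype=int)
    next_id = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue  # already assigned to a cluster
        labels[seed] = next_id
        stack = [seed]
        while stack:
            i = stack.pop()
            dist = np.linalg.norm(features - features[i], axis=1)
            # Grow the cluster with all unlabeled points within the radius.
            for j in np.flatnonzero((dist < radius) & (labels == -1)):
                labels[j] = next_id
                stack.append(j)
        next_id += 1
    return labels


def panoptic_labels(semantic, features, thing_classes, radius):
    """Step (i) is assumed done: `semantic` holds one class id per point.
    Step (ii): cluster the points of each 'thing' class into instances.
    Returns one instance id per point; -1 marks points without an instance."""
    instance = np.full(len(semantic), -1, dtype=int)
    offset = 0  # keeps instance ids unique across classes
    for cls in thing_classes:
        mask = semantic == cls
        if not mask.any():
            continue
        local = cluster_by_radius(features[mask], radius)
        instance[mask] = local + offset
        offset += local.max() + 1
    return instance


# Toy scene: two well-separated objects of class 1 and one ground point (class 0).
xyz = np.array([[0.0, 0, 0], [0.1, 0, 0], [5.0, 0, 0], [5.1, 0, 0], [2.0, 2, 0]])
semantic = np.array([1, 1, 1, 1, 0])
inst = panoptic_labels(semantic, xyz, thing_classes=[1], radius=0.5)
# Expected instance ids: 0, 0, 1, 1, -1
```

Swapping `xyz` for an array of per-point embeddings, or for coordinates shifted by predicted offsets, changes only the input to the clustering step, which is the design choice the study compares.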

Code Implementations: 1 repo
