GFNet: Geometric Flow Network for 3D Point Cloud Semantic Segmentation
This work addresses the challenge of effectively combining complementary geometric information in point cloud segmentation for applications like autonomous driving, representing an incremental improvement over existing projection-based methods.
The paper tackles 3D point cloud semantic segmentation by proposing GFNet, a geometric flow network that aligns and fuses information from different projected views, such as range-view and bird's-eye-view, and achieves state-of-the-art results among projection-based methods on the SemanticKITTI and nuScenes datasets.
Point cloud semantic segmentation from projected views, such as range-view (RV) and bird's-eye-view (BEV), has been intensively investigated. Different views capture different information about point clouds and are thus complementary to each other. However, recent projection-based methods for point cloud semantic segmentation usually apply a vanilla late fusion strategy to the predictions of different views, failing to explore the complementary information from a geometric perspective during representation learning. In this paper, we introduce a geometric flow network (GFNet) to explore the geometric correspondence between different views in an align-before-fuse manner. Specifically, we devise a novel geometric flow module (GFM) to bidirectionally align and propagate the complementary information across different views according to geometric relationships under an end-to-end learning scheme. We perform extensive experiments on two widely used benchmark datasets, SemanticKITTI and nuScenes, to demonstrate the effectiveness of our GFNet for projection-based point cloud semantic segmentation. Concretely, GFNet not only significantly boosts the performance of each individual view but also achieves state-of-the-art results over all existing projection-based models. Code is available at https://github.com/haibo-qiu/GFNet.
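
To make the idea of geometric correspondence between views concrete, the following is a minimal sketch and not the GFNet implementation. It projects each 3D point into a range-view pixel (spherical projection) and a bird's-eye-view grid cell, then links the two through the shared point. All function names, the 64 x 2048 range image, the vertical field of view of +3 to -25 degrees, and the Cartesian 500 x 500 BEV grid over a 100 m square are illustrative assumptions; the paper's actual BEV parameterization and alignment module may differ.

```python
import math

def range_view_pixel(x, y, z, width=2048, height=64,
                     fov_up_deg=3.0, fov_down_deg=-25.0):
    """Spherical projection of one point to a (row, col) range-image pixel.

    Image size and field of view are assumed values for illustration.
    """
    depth = max(math.sqrt(x * x + y * y + z * z), 1e-8)  # avoid division by zero
    yaw = math.atan2(y, x)            # azimuth in [-pi, pi]
    pitch = math.asin(z / depth)      # elevation angle
    fov_down = math.radians(fov_down_deg)
    fov = math.radians(fov_up_deg) - fov_down
    col = 0.5 * (1.0 - yaw / math.pi) * width
    row = (1.0 - (pitch - fov_down) / fov) * height
    # Clamp to valid pixel indices.
    col = min(width - 1, max(0, int(math.floor(col))))
    row = min(height - 1, max(0, int(math.floor(row))))
    return row, col

def bev_cell(x, y, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0),
             grid=(500, 500)):
    """Top-down Cartesian grid cell (row, col), or None if out of range.

    A Cartesian grid is an assumption made here for simplicity.
    """
    if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
        return None
    row = int((x - x_range[0]) / (x_range[1] - x_range[0]) * grid[0])
    col = int((y - y_range[0]) / (y_range[1] - y_range[0]) * grid[1])
    return row, col

def rv_to_bev_correspondence(points):
    """Map each occupied RV pixel to the BEV cell of a point landing in it.

    If several points share an RV pixel, the last one processed wins.
    """
    mapping = {}
    for x, y, z in points:
        cell = bev_cell(x, y)
        if cell is not None:
            mapping[range_view_pixel(x, y, z)] = cell
    return mapping

# Example: one point inside the BEV range, one outside it.
correspondence = rv_to_bev_correspondence([(10.1, 0.1, 0.0), (100.0, 0.0, 0.0)])
```

A lookup table of this kind is one way features sampled at a pixel in one view could be gathered at the matching location in the other view before fusion; the reverse (BEV to RV) table can be built the same way by swapping keys and values.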