CV · RO · Mar 9, 2023

Rethinking Range View Representation for LiDAR Segmentation

arXiv:2303.05367v3 · 202 citations · h-index: 58
AI Analysis

This work makes range view methods competitive for LiDAR segmentation in autonomous driving perception. Because range view models operate on 2D range images, this could simplify processing and improve efficiency in real-world applications.

The paper tackles the underperformance of range view representations in LiDAR segmentation. It identifies three impediments to learning from range view projections (many-to-one mapping, semantic incoherence, and shape deformation) and introduces RangeFormer, a full-cycle framework that surpasses point, voxel, and multi-view fusion methods on the SemanticKITTI, nuScenes, and ScribbleKITTI benchmarks.

LiDAR segmentation is crucial for autonomous driving perception. Recent trends favor point- or voxel-based methods as they often yield better performance than the traditional range view representation. In this work, we unveil several key factors in building powerful range view models. We observe that the "many-to-one" mapping, semantic incoherence, and shape deformation are possible impediments against effective learning from range view projections. We present RangeFormer -- a full-cycle framework comprising novel designs across network architecture, data augmentation, and post-processing -- that better handles the learning and processing of LiDAR point clouds from the range view. We further introduce a Scalable Training from Range view (STR) strategy that trains on arbitrary low-resolution 2D range images, while still maintaining satisfactory 3D segmentation accuracy. We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts in the competing LiDAR semantic and panoptic segmentation benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI.
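The "many-to-one" mapping mentioned in the abstract can be illustrated with a minimal sketch of a range view projection. This is not the authors' implementation. It assumes the spherical projection commonly used in range view pipelines, a hypothetical 64 x 512 image, a vertical field of view of +3 to -25 degrees, and a keep-the-nearest-point rule for occupied pixels. The function name `range_projection` and all parameters are illustrative.

```python
import math


def range_projection(points, height=64, width=512,
                     fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) points onto a height x width range image.

    Returns (image, collisions), where image maps (row, col) to the depth
    of the nearest point in that pixel, and collisions counts the points
    that landed on an already occupied pixel (the many-to-one cases).
    All default values are assumptions chosen for illustration.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down  # total vertical field of view in radians

    image = {}
    collisions = 0
    for x, y, z in points:
        depth = math.sqrt(x * x + y * y + z * z)
        if depth == 0.0:
            continue  # skip degenerate points at the sensor origin

        yaw = math.atan2(y, x)        # horizontal angle
        pitch = math.asin(z / depth)  # vertical angle

        # Normalize the angles to pixel coordinates.
        u = 0.5 * (1.0 - yaw / math.pi) * width
        v = (1.0 - (pitch - fov_down) / fov) * height

        # Clamp to the image bounds.
        col = min(width - 1, max(0, int(u)))
        row = min(height - 1, max(0, int(v)))

        key = (row, col)
        if key in image:
            # Several 3D points share one pixel: keep only the nearest.
            collisions += 1
            image[key] = min(image[key], depth)
        else:
            image[key] = depth
    return image, collisions


# Two points on the same ray at different depths fall into one pixel.
image, collisions = range_projection([(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
print(len(image), collisions)
```

In this sketch the two points share a pixel, so only the nearer one survives in the image. Lowering `height` or `width` makes such collisions more frequent, which is why training on low-resolution range images while still recovering 3D segmentation accuracy is a non-trivial goal.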

