CV · Jul 14, 2023

LEST: Large-scale LiDAR Semantic Segmentation with Transformer

arXiv:2307.09367v1 · 2 citations · h-index: 5
AI Analysis

The paper addresses a critical perception problem in autonomous driving. Its approach is new for this task, though the contribution is incremental in that it applies Transformers to a domain where they had seen limited use.

The paper tackles large-scale LiDAR point cloud semantic segmentation for autonomous driving by proposing LEST, a pure Transformer architecture. The authors report that it outperforms all other state-of-the-art methods on the nuScenes validation set and the SemanticKITTI test set.

Large-scale LiDAR-based point cloud semantic segmentation is a critical task in autonomous driving perception. Almost all of the previous state-of-the-art LiDAR semantic segmentation methods are variants of sparse 3D convolution. Although the Transformer architecture is becoming popular in the field of natural language processing and 2D computer vision, its application to large-scale point cloud semantic segmentation is still limited. In this paper, we propose a LiDAR sEmantic Segmentation architecture with pure Transformer, LEST. LEST comprises two novel components: a Space Filling Curve (SFC) Grouping strategy and a Distance-based Cosine Linear Transformer, DISCO. On the public nuScenes semantic segmentation validation set and SemanticKITTI test set, our model outperforms all the other state-of-the-art methods.
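To make the idea of Space Filling Curve grouping more concrete, here is a minimal sketch of one way points could be ordered along a curve and cut into fixed-size groups for attention. The abstract does not say which curve LEST uses or how groups are formed, so everything here is an assumption for illustration: the Z-order (Morton) curve, the function names `morton3d` and `sfc_group`, and the parameters `voxel_size`, `group_size`, and `bits` are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def morton3d(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of three cell indices into one Z-order code."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def sfc_group(
    points: Sequence[Point],
    voxel_size: float = 0.5,   # assumed cell edge length, not from the paper
    group_size: int = 4,       # assumed number of points per group
    bits: int = 10,
) -> List[List[int]]:
    """Sort point indices along a Z-order curve and split them into groups.

    Returns lists of indices into `points`; the last group may be shorter.
    """
    if not points:
        return []

    # Shift coordinates so every cell index is non-negative.
    mins = [min(p[axis] for p in points) for axis in range(3)]
    max_cell = (1 << bits) - 1

    def curve_key(i: int) -> int:
        # Quantize to a voxel grid; cells beyond the grid are clamped to its edge.
        cells = [
            min(int((points[i][axis] - mins[axis]) / voxel_size), max_cell)
            for axis in range(3)
        ]
        return morton3d(cells[0], cells[1], cells[2], bits=bits)

    # sorted() is stable, so points in the same cell keep their input order.
    order = sorted(range(len(points)), key=curve_key)
    return [order[s:s + group_size] for s in range(0, len(order), group_size)]


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0), (0.1, 0.1, 0.1), (10.2, 10.1, 10.0)]
    print(sfc_group(cloud, voxel_size=0.5, group_size=2))
```

Because points that are close on the curve tend to be close in space, each group can then be processed by attention over a bounded number of tokens. Other curves, such as the Hilbert curve, generally preserve locality better than Z-order; the choice here is only for brevity.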

