LaserMix for Semi-Supervised LiDAR Semantic Segmentation
LaserMix addresses the scalability issue in LiDAR segmentation for autonomous driving and robotics by reducing annotation costs, though it is an incremental improvement over existing semi-supervised methods.
The paper tackles the high cost of densely annotating LiDAR point clouds for semantic segmentation. It proposes LaserMix, a semi-supervised learning method that mixes laser beams from different scans to exploit the spatial cues of LiDAR data. The method achieves results competitive with fully-supervised counterparts using 2x to 5x fewer labels and improves the supervised-only baseline by 10.8% on average.
Densely annotating LiDAR point clouds is costly, which restrains the scalability of fully-supervised learning methods. In this work, we study the underexplored semi-supervised learning (SSL) in LiDAR segmentation. Our core idea is to leverage the strong spatial cues of LiDAR point clouds to better exploit unlabeled data. We propose LaserMix to mix laser beams from different LiDAR scans, and then encourage the model to make consistent and confident predictions before and after mixing. Our framework has three appealing properties: 1) Generic: LaserMix is agnostic to LiDAR representations (e.g., range view and voxel), and hence our SSL framework can be universally applied. 2) Statistically grounded: We provide a detailed analysis to theoretically explain the applicability of the proposed framework. 3) Effective: Comprehensive experimental analysis on popular LiDAR segmentation datasets (nuScenes, SemanticKITTI, and ScribbleKITTI) demonstrates the effectiveness and superiority of our approach. Notably, we achieve competitive results over fully-supervised counterparts with 2x to 5x fewer labels and improve the supervised-only baseline significantly by 10.8% on average. We hope this concise yet high-performing framework could facilitate future research in semi-supervised LiDAR segmentation. Code is publicly available.
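The beam-mixing step described above can be illustrated with a minimal sketch. The abstract only states that laser beams from different scans are mixed, so the details here are assumptions for illustration: points are grouped into bands by inclination angle, alternating bands are swapped between two scans, and the function names (`inclination`, `area_index`, `lasermix`), the number of bands, and the default angle range are all hypothetical choices, not the authors' released implementation.

```python
import numpy as np


def inclination(points):
    """Inclination angle (radians) of each point above the sensor's horizontal plane."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.arctan2(z, np.hypot(x, y))


def area_index(points, edges):
    """Assign each point to an inclination band: 0 (lowest) .. len(edges) - 2 (highest).

    Only the interior edges are passed to np.digitize, so points outside the
    nominal angle range fall into the first or last band.
    """
    return np.digitize(inclination(points), edges[1:-1])


def lasermix(points_a, labels_a, points_b, labels_b, num_areas=4,
             phi_min=np.deg2rad(-25.0), phi_max=np.deg2rad(3.0)):
    """Swap alternating inclination bands between two scans (illustrative sketch).

    points_*: (N, 3) arrays of x, y, z; labels_*: (N,) arrays of labels or
    pseudo-labels. The angle range is an assumed sensor field of view.
    Returns two mixed scans, each as a (points, labels) pair.
    """
    edges = np.linspace(phi_min, phi_max, num_areas + 1)
    even_a = area_index(points_a, edges) % 2 == 0
    even_b = area_index(points_b, edges) % 2 == 0

    # Mixed scan 1: even bands from scan A, odd bands from scan B.
    mix1_points = np.concatenate([points_a[even_a], points_b[~even_b]])
    mix1_labels = np.concatenate([labels_a[even_a], labels_b[~even_b]])
    # Mixed scan 2: the complementary selection.
    mix2_points = np.concatenate([points_b[even_b], points_a[~even_a]])
    mix2_labels = np.concatenate([labels_b[even_b], labels_a[~even_a]])
    return (mix1_points, mix1_labels), (mix2_points, mix2_labels)
```

In a semi-supervised setting, one of the two scans would typically be unlabeled, with `labels_b` standing in for the model's own pseudo-labels; the model's predictions on the mixed scans can then be compared against the correspondingly mixed labels to encourage consistency before and after mixing.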