CV ROMar 10, 2025

Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation

Ziliang Miao, Runjian Chen, Yixi Cai, Buwei He, Wenquan Zhao, Wenqi Shao, Bo Zhang, Fu Zhang

arXiv:2503.07167v2h-index: 26

Originality Incremental advance

AI Analysis

This addresses the labeling burden for autonomous systems like self-driving vehicles, though it is incremental as it builds on existing self-supervised learning approaches.

The paper tackles the problem of moving object segmentation in LiDAR point clouds by proposing Temporal Overlapping Prediction, a self-supervised pre-training method that reduces reliance on manual annotations, achieving up to 28.77% relative improvement over baselines on datasets like nuScenes and SemanticKITTI.

Moving object segmentation (MOS) on LiDAR point clouds is crucial for autonomous systems like self-driving vehicles. Previous supervised approaches rely heavily on costly manual annotations, while LiDAR sequences naturally capture temporal motion cues that can be leveraged for self-supervised learning. In this paper, we propose Temporal Overlapping Prediction (TOP), a self-supervised pre-training method that alleviate the labeling burden for MOS. TOP explores the temporal overlapping points that commonly observed by current and adjacent scans, and learns spatiotemporal representations by predicting the occupancy states of temporal overlapping points. Moreover, we utilize current occupancy reconstruction as an auxiliary pre-training objective, which enhances the current structural awareness of the model. We conduct extensive experiments and observe that the conventional metric Intersection-over-Union (IoU) shows strong bias to objects with more scanned points, which might neglect small or distant objects. To compensate for this bias, we introduce an additional metric called mIoU_obj to evaluate object-level performance. Experiments on nuScenes and SemanticKITTI show that TOPoutperforms both supervised training-from-scratch baseline and other self-supervised pre-training baselines by up to 28.77% relative improvement, demonstrating strong transferability across LiDAR setups and generalization to other tasks. Code and pre-trained models will be publicly available upon publication.

View on arXiv PDF

Similar