CV · Oct 13, 2025

RareBoost3D: A Synthetic LiDAR Dataset with Enhanced Rare Classes

arXiv:2510.10876v1 · h-index: 40
Originality: Incremental advance
AI Analysis

This paper addresses data scarcity for rare object classes in autonomous driving perception. The advance is incremental because it builds on existing synthetic data approaches.

The authors tackle the long-tail problem in LiDAR point cloud datasets with RareBoost3D, a synthetic dataset that supplies additional instances of rare classes. They also propose a cross-domain semantic alignment method (CSC loss) for training on synthetic and real-world data together. According to the abstract, this alignment significantly improves segmentation performance on real-world data.

Real-world point cloud datasets have made significant contributions to the development of LiDAR-based perception technologies, such as object segmentation for autonomous driving. However, due to the limited number of instances in some rare classes, the long-tail problem remains a major challenge in existing datasets. To address this issue, we introduce a novel, synthetic point cloud dataset named RareBoost3D, which complements existing real-world datasets by providing significantly more instances for object classes that are rare in real-world datasets. To effectively leverage both synthetic and real-world data, we further propose a cross-domain semantic alignment method named CSC loss that aligns feature representations of the same class across different domains. Experimental results demonstrate that this alignment significantly enhances the performance of LiDAR point cloud segmentation models over real-world data.
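The abstract does not give the formula for the CSC loss, so the following is only a generic sketch of what "aligning feature representations of the same class across different domains" can look like. It is not the authors' method. It computes one mean feature vector (a prototype) per class in each domain and penalizes the squared distance between same-class prototypes. The function names `class_prototypes` and `prototype_alignment_loss`, the feature shapes, and the choice of squared Euclidean distance are all assumptions made for illustration.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class; rows for absent classes are NaN."""
    protos = np.full((num_classes, features.shape[1]), np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(axis=0)
    return protos


def prototype_alignment_loss(feat_syn, lab_syn, feat_real, lab_real, num_classes):
    """Mean squared distance between same-class prototypes of two domains.

    Illustrative stand-in only, NOT the paper's CSC loss.
    Classes missing from either domain are skipped.
    """
    p_syn = class_prototypes(feat_syn, lab_syn, num_classes)
    p_real = class_prototypes(feat_real, lab_real, num_classes)
    # Keep only classes that have at least one point in both domains.
    shared = ~np.isnan(p_syn).any(axis=1) & ~np.isnan(p_real).any(axis=1)
    if not shared.any():
        return 0.0
    diff = p_syn[shared] - p_real[shared]
    return float((diff ** 2).sum(axis=1).mean())


# Usage with random per-point features (hypothetical shapes: N points x 4 dims).
rng = np.random.default_rng(0)
feat_syn = rng.normal(size=(8, 4))
lab_syn = rng.integers(0, 3, size=8)
feat_real = rng.normal(size=(10, 4))
lab_real = rng.integers(0, 3, size=10)
loss = prototype_alignment_loss(feat_syn, lab_syn, feat_real, lab_real, num_classes=3)
```

In a training setup, a term of this kind would typically be added to the usual segmentation loss with a weighting factor, so that synthetic and real features of a rare class are pulled toward a common representation. Whether the paper does this at the prototype level or per point is not stated in the abstract.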

