CVAug 1, 2024

Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection

arXiv:2408.00619v28 citationsh-index: 4
Originality Incremental advance
AI Analysis

This addresses the challenge of inaccurate pseudo labels in unsupervised 3D object detection for autonomous driving, though it is incremental as it builds on existing clustering-based methods.

The paper tackles the problem of noise in pseudo ounding boxes for unsupervised 3D object detection by introducing an uncertainty-aware framework that estimates and regularizes uncertainty to refine training, resulting in improvements such as +6.9% AP_BEV and +2.5% AP_3D on nuScenes.

Unsupervised 3D object detection aims to identify objects of interest from unlabeled raw data, such as LiDAR points. Recent approaches usually adopt pseudo 3D bounding boxes (3D bboxes) from clustering algorithm to initialize the model training. However, pseudo bboxes inevitably contain noise, and such inaccuracies accumulate to the final model, compromising the performance. Therefore, in an attempt to mitigate the negative impact of inaccurate pseudo bboxes, we introduce a new uncertainty-aware framework for unsupervised 3D object detection, dubbed UA3D. In particular, our method consists of two phases: uncertainty estimation and uncertainty regularization. (1) In the uncertainty estimation phase, we incorporate an extra auxiliary detection branch alongside the original primary detector. The prediction disparity between the primary and auxiliary detectors could reflect fine-grained uncertainty at the box coordinate level. (2) Based on the assessed uncertainty, we adaptively adjust the weight of every 3D bbox coordinate via uncertainty regularization, refining the training process on pseudo bboxes. For pseudo bbox coordinate with high uncertainty, we assign a relatively low loss weight. Extensive experiments verify that the proposed method is robust against the noisy pseudo bboxes, yielding substantial improvements on nuScenes and Lyft compared to existing approaches, with increases of +6.9% AP$_{BEV}$ and +2.5% AP$_{3D}$ on nuScenes, and +4.1% AP$_{BEV}$ and +2.0% AP$_{3D}$ on Lyft.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes