CV | Nov 7, 2023

3DifFusionDet: Diffusion Model for 3D Object Detection with Robust LiDAR-Camera Fusion

Xinhao Xiang, Simon Dräger, Jiawei Zhang

arXiv:2311.03742v1 | 6.88 citations | h-index: 3

Originality: Incremental advance

AI Analysis

This paper addresses robust 3D object detection for autonomous driving, but it appears incremental, as it adapts diffusion models to an existing task.

The paper tackles 3D object detection from LiDAR-Camera sensors by proposing 3DifFusionDet, a framework that structures detection as a denoising diffusion process, and it performs favorably compared to earlier detectors on the KITTI benchmark.

Good 3D object detection performance from LiDAR-Camera sensors demands seamless feature alignment and fusion strategies. In this paper, we propose the 3DifFusionDet framework, which structures 3D object detection as a denoising diffusion process from noisy 3D boxes to target boxes. During training, ground-truth boxes are diffused toward a random distribution, and the model learns to reverse this noising process. During inference, the model gradually refines a set of randomly generated boxes into the final detections. Combined with the feature alignment strategy, this progressive refinement could contribute significantly to robust LiDAR-Camera fusion. The iterative refinement process could also make the framework adaptable to detection settings that require different levels of accuracy and speed. Extensive experiments on KITTI, a benchmark for real-world traffic object detection, show that 3DifFusionDet performs favorably compared with earlier, well-respected detectors.
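The noising and refinement loop described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the 7-value box parameterization (x, y, z, l, w, h, yaw), the cosine noise schedule, the DDIM-style deterministic update, and the names `cosine_alpha_bar`, `q_sample`, `ddim_step`, `refine`, and `predict_x0` are all assumptions chosen for illustration. In the paper, the role of `predict_x0` would be played by a detection head conditioned on fused LiDAR-Camera features.

```python
import numpy as np


def cosine_alpha_bar(num_steps: int, s: float = 0.008) -> np.ndarray:
    """Cumulative signal level alpha_bar[t] for t = 0..num_steps.

    Assumed cosine schedule; alpha_bar[0] == 1 (clean), alpha_bar[-1] ~ 0 (noise).
    """
    t = np.linspace(0.0, 1.0, num_steps + 1)
    f = np.cos((t + s) / (1.0 + s) * np.pi / 2.0) ** 2
    return f / f[0]


def q_sample(x0, t, alpha_bar, rng):
    """Forward (training-time) step: corrupt clean boxes x0 to noise level t."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise


def ddim_step(xt, x0_pred, t, t_prev, alpha_bar):
    """Deterministic DDIM-style move from level t to t_prev (requires t > 0)."""
    # Noise implied by the current boxes and the predicted clean boxes.
    eps = (xt - np.sqrt(alpha_bar[t]) * x0_pred) / np.sqrt(1.0 - alpha_bar[t])
    return np.sqrt(alpha_bar[t_prev]) * x0_pred + np.sqrt(1.0 - alpha_bar[t_prev]) * eps


def refine(predict_x0, num_boxes, box_dim, timesteps, alpha_bar, rng):
    """Inference: start from random boxes and refine along decreasing timesteps."""
    x = rng.standard_normal((num_boxes, box_dim))
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        x0_pred = predict_x0(x, t)  # hypothetical stand-in for the learned detector
        x = ddim_step(x, x0_pred, t, t_prev, alpha_bar)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = cosine_alpha_bar(1000)
    # Two made-up, normalized boxes: (x, y, z, l, w, h, yaw).
    gt = np.array([[0.1, -0.2, 0.0, 0.4, 0.2, 0.15, 0.3],
                   [-0.5, 0.3, 0.1, 0.5, 0.2, 0.2, -1.0]])
    noisy, _ = q_sample(gt, 500, alpha_bar, rng)
    # An oracle predictor stands in for the trained model.
    boxes = refine(lambda x, t: gt, 2, 7, [1000, 500, 0], alpha_bar, rng)
    print(np.abs(noisy - gt).max(), np.abs(boxes - gt).max())
```

The length of `timesteps` is the knob behind the accuracy/speed trade-off mentioned above: fewer entries mean fewer calls to the predictor, while more entries give the model more chances to correct its boxes.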
