CV · AI · Nov 8, 2024

SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection

arXiv:2411.05292v1 · 23 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses 3D object detection for autonomous vehicles, presenting an incremental improvement over existing fusion methods.

The paper improves LiDAR-camera fusion in a bird's-eye-view (BEV) framework by refining both the camera and LiDAR encoders, and reports a nuScenes Detection Score (NDS) of 77.6% on the nuScenes dataset.

More and more research works fuse the LiDAR and camera information to improve the 3D object detection of the autonomous driving system. Recently, a simple yet effective fusion framework has achieved an excellent detection performance, fusing the LiDAR and camera features in a unified bird's-eye-view (BEV) space. In this paper, we propose a LiDAR-camera fusion framework, named SimpleBEV, for accurate 3D object detection, which follows the BEV-based fusion framework and improves the camera and LiDAR encoders, respectively. Specifically, we perform the camera-based depth estimation using a cascade network and rectify the depth results with the depth information derived from the LiDAR points. Meanwhile, an auxiliary branch that implements the 3D object detection using only the camera-BEV features is introduced to exploit the camera information during the training phase. Besides, we improve the LiDAR feature extractor by fusing the multi-scaled sparse convolutional features. Experimental results demonstrate the effectiveness of our proposed method. Our method achieves 77.6% NDS accuracy on the nuScenes dataset, showcasing superior performance in the 3D object detection track.
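
The abstract's idea of rectifying camera-estimated depth with LiDAR-derived depth can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the rectification rule, so the per-pixel blend below, the function name `rectify_depth`, and the `lidar_weight` parameter are all assumptions made for illustration. The sketch assumes the LiDAR points have already been projected into the image plane, giving a sparse depth map with `None` wherever no point landed.

```python
from typing import List, Optional

DepthMap = List[List[float]]
SparseDepthMap = List[List[Optional[float]]]


def rectify_depth(
    predicted: DepthMap,
    lidar: SparseDepthMap,
    lidar_weight: float = 1.0,
) -> DepthMap:
    """Blend camera-predicted depth with sparse LiDAR depth (hypothetical rule).

    Pixels without a LiDAR measurement (None) keep the camera prediction.
    Pixels with a measurement use a weighted average; lidar_weight=1.0
    simply overwrites the prediction with the LiDAR value.
    """
    if not 0.0 <= lidar_weight <= 1.0:
        raise ValueError("lidar_weight must be in [0, 1]")

    rectified: DepthMap = []
    for pred_row, lidar_row in zip(predicted, lidar):
        row = []
        for pred, measured in zip(pred_row, lidar_row):
            if measured is None:
                # No LiDAR return at this pixel: trust the camera estimate.
                row.append(pred)
            else:
                # LiDAR return available: pull the estimate toward it.
                row.append(lidar_weight * measured + (1.0 - lidar_weight) * pred)
        rectified.append(row)
    return rectified


# Toy 2x2 depth maps in metres; only one pixel has a LiDAR return.
camera_depth = [[10.0, 20.0], [30.0, 40.0]]
lidar_depth = [[None, 18.0], [None, None]]
corrected = rectify_depth(camera_depth, lidar_depth)
```

In this toy case only the pixel with a LiDAR return changes. A real pipeline would more likely operate on per-pixel depth distributions and learned features than on scalar depths, so this should be read only as an intuition for why sparse but accurate LiDAR depth can correct dense but noisy camera depth.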
