FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention
This work addresses multi-sensor fusion for autonomous driving, offering an incremental improvement through attention modules tailored to each sensor type.
The paper tackles the challenge of robust 3D object detection by fusing camera, LiDAR, and radar sensors using local featured attention modules, achieving state-of-the-art performance on the NuScenes dataset with 71.8% NDS and 69.0% mAP for LiDAR-camera tasks and 51.7% NDS and 40.3% mAP for radar-camera tasks.
Camera, LiDAR and radar are common perception sensors for autonomous driving tasks. Robust 3D object detection is best achieved by fusing these sensors. Exploiting their capabilities effectively remains a challenge because each sensor has its own characteristics. In this paper, we propose FADet, a multi-sensor 3D detection network that addresses the characteristics of each sensor through our local featured attention modules. For camera images, we propose a dual-attention-based sub-module. For LiDAR point clouds, a triple-attention-based sub-module is used, while a mixed-attention-based sub-module is applied to the features of radar points. With these local featured attention sub-modules, FADet detects objects effectively in long-tail and complex scenes from camera, LiDAR and radar input. On the NuScenes validation set, FADet achieves state-of-the-art performance on LiDAR-camera object detection tasks with 71.8% NDS and 69.0% mAP, and on radar-camera object detection tasks with 51.7% NDS and 40.3% mAP. Code will be released at https://github.com/ZionGo6/FADet.
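For readers less familiar with the reported metrics, the sketch below shows how NDS (the nuScenes detection score) is conventionally computed from mAP and the five mean true-positive error terms of the benchmark. It is an illustration only, not code from FADet: the function name `nds` and the error values in `example_errors` are hypothetical and were not taken from the paper.

```python
def nds(mean_ap, tp_errors):
    """Combine mAP with the mean true-positive errors into a single score.

    mean_ap   : mAP as a fraction in [0, 1]
    tp_errors : mapping of the five error names (mATE, mASE, mAOE, mAVE, mAAE)
                to their values; each error is clamped to at most 1
    """
    # Each error is turned into a score in [0, 1], where lower error is better.
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    # mAP carries weight 5, each of the five error scores carries weight 1.
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Hypothetical error values chosen only for illustration; not results from the paper.
example_errors = {"mATE": 0.30, "mASE": 0.25, "mAOE": 0.30, "mAVE": 0.25, "mAAE": 0.20}
score = nds(0.69, example_errors)
print(f"NDS = {score:.3f}")
```

Because mAP accounts for half of the weight, two systems with the same mAP can still differ in NDS through their translation, scale, orientation, velocity and attribute errors.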