CV | Apr 3, 2024

DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection

arXiv:2404.03015v2 | 35 citations | h-index: 10 | Has Code | IEEE Trans Intell Veh
Originality: Incremental advance
AI Analysis

The work addresses the need for cost-effective and reliable perception in autonomous driving, though it is an incremental improvement over existing camera-radar fusion methods.

The paper tackles the problem of robust and efficient object detection for autonomous vehicles by proposing a camera-radar fusion method that uses lower-level radar data and dual-plane projections, achieving state-of-the-art performance on the K-Radar dataset with improved robustness in adverse weather and low inference time.

The perception of autonomous vehicles has to be efficient, robust, and cost-effective. However, cameras are not robust against severe weather conditions, lidar sensors are expensive, and the performance of radar-based perception is still inferior to the others. Camera-radar fusion methods have been proposed to address this issue, but these are constrained by the typical sparsity of radar point clouds and often designed for radars without elevation information. We propose a novel camera-radar fusion approach called Dual Perspective Fusion Transformer (DPFT), designed to overcome these limitations. Our method leverages lower-level radar data (the radar cube) instead of the processed point clouds to preserve as much information as possible and employs projections in both the camera and ground planes to effectively use radars with elevation information and simplify the fusion with camera data. As a result, DPFT has demonstrated state-of-the-art performance on the K-Radar dataset while showing remarkable robustness against adverse weather conditions and maintaining a low inference time. The code is made available as open-source software under https://github.com/TUMFTM/DPFT.
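The dual-projection idea in the abstract can be illustrated with a toy example. The sketch below is not the DPFT implementation; it only shows one plausible way to collapse a dense radar cube into a ground-plane view and a camera-plane-like view. The axis order (range, azimuth, elevation), the use of a maximum as the reduction, and the function name `dual_plane_projections` are all assumptions made for illustration.

```python
from __future__ import annotations

import numpy as np


def dual_plane_projections(cube: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Collapse a 3D radar cube into two 2D views.

    Assumed (hypothetical) axis order: (range, azimuth, elevation).
    The max reduction is an illustrative choice, not taken from the paper.
    """
    if cube.ndim != 3:
        raise ValueError("expected a 3D array (range, azimuth, elevation)")
    # Ground-plane view: reduce over elevation, keep range x azimuth.
    ground = cube.max(axis=2)
    # Camera-plane-like view: reduce over range, keep elevation x azimuth,
    # so rows correspond to elevation and columns to azimuth.
    camera = cube.max(axis=0).T
    return ground, camera


# Toy data standing in for radar power values.
rng = np.random.default_rng(0)
cube = rng.random((8, 6, 4))
ground, camera = dual_plane_projections(cube)
print(ground.shape, camera.shape)
```

With the toy cube of shape (8, 6, 4), the ground-plane view has shape (8, 6) and the camera-plane-like view has shape (4, 6). Keeping both views retains elevation information that a single bird's-eye projection would discard, which is the motivation the abstract gives for projecting into both planes.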

Code Implementations: 1 repo