CV · Oct 14, 2024

ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object Detection

arXiv:2410.10298v2 · 1 citations · h-index: 5 · Has Code · IROS
Originality: Incremental advance
AI Analysis

This paper addresses a specific challenge in vision-based 3D object detection for autonomous driving and represents an incremental improvement.

The paper tackles 3D object detection in autonomous driving for objects that are visually similar to the background from the camera perspective. It proposes ROA-BEV, which builds on BEVDepth and improves its performance in experiments on nuScenes.

Vision-based Bird's-Eye-View (BEV) 3D object detection has recently become popular in autonomous driving. However, objects with a high similarity to the background from a camera perspective cannot be detected well by existing methods. In this paper, we propose a BEV-based 3D Object Detection Network with 2D Region-Oriented Attention (ROA-BEV), which enables the backbone to focus more on feature learning of the regions where objects exist. Moreover, our method further enhances the information feature learning ability of ROA through multi-scale structures. Each block of ROA utilizes a large kernel to ensure that the receptive field is large enough to catch information about large objects. Experiments on nuScenes show that ROA-BEV improves the performance based on BEVDepth. The source codes of this work will be available at https://github.com/DFLyan/ROA-BEV.
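The abstract's point that a large kernel keeps the receptive field large enough for big objects can be illustrated with the standard receptive-field recurrence for stacked convolutions. The sketch below is only an illustration under stated assumptions: the function name `receptive_field` and the kernel sizes and strides are hypothetical and are not taken from the paper or its code.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of stacked convolutions.

    `layers` is a sequence of (kernel_size, stride) pairs ordered from input
    to output. The values used below are hypothetical, not the paper's.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # spacing, in input pixels, between adjacent outputs of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump  # each extra tap widens the view by one jump
        jump *= stride                  # striding spreads later taps further apart
    return rf


# Three stacked 3x3 convolutions versus three stacked 7x7 convolutions (stride 1).
small_kernel_rf = receptive_field([(3, 1)] * 3)
large_kernel_rf = receptive_field([(7, 1)] * 3)

# A multi-scale style stack: downsampling between blocks widens the view further.
multi_scale_rf = receptive_field([(7, 2), (7, 2), (7, 1)])

print(small_kernel_rf, large_kernel_rf, multi_scale_rf)
```

With these assumed settings, the larger kernel and the strided (multi-scale) stack both cover a much wider input region than the small-kernel stack, which is the property the abstract attributes to the ROA blocks.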
