CV · Aug 15, 2024

Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement

Wenxuan Li, Qin Zou, Chi Chen, Bo Du, Long Chen, Jian Zhou, Hongkai Yu

arXiv:2408.07999v25.22 citations · h-index: 93 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses perception challenges for autonomous driving systems, but it appears incremental as it builds on existing detection frameworks with specific refinements.

The paper tackles the problem of 3D object detection in driving scenarios by proposing Co-Fix3D, a framework that integrates Local and Global Enhancement modules to refine features, achieving 69.4% mAP and 73.5% NDS on the nuScenes LiDAR benchmark and 72.3% mAP and 74.7% NDS on the multimodal benchmark.

3D object detection in driving scenarios faces the challenge of complex road environments, which can lead to the loss or incompleteness of key features, thereby affecting perception performance. To address this issue, we propose an advanced detection framework called Co-Fix3D. Co-Fix3D integrates Local and Global Enhancement (LGE) modules to refine Bird's Eye View (BEV) features. The LGE module uses Discrete Wavelet Transform (DWT) for pixel-level local optimization and incorporates an attention mechanism for global optimization. To handle varying detection difficulties, we adopt multi-head LGE modules, enabling each module to focus on targets with different levels of detection complexity, thus further enhancing overall perception capability. Experimental results show that on the nuScenes dataset's LiDAR benchmark, Co-Fix3D achieves 69.4% mAP and 73.5% NDS, while on the multimodal benchmark, it achieves 72.3% mAP and 74.7% NDS. The source code is publicly available at https://github.com/rubbish001/Co-Fix3d.
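To make the DWT step in the abstract more concrete, here is a minimal sketch of a single-level 2D Haar wavelet transform applied to one channel of a small feature grid. It is not the authors' implementation: the abstract does not say which wavelet or how many levels the LGE module uses, so the Haar choice, the function names `haar_dwt2` and `haar_idwt2`, and the toy input are all assumptions made for illustration. The sketch only shows the general idea that a DWT splits a map into a half-resolution approximation plus detail bands, and that the original map can be recovered exactly from them.

```python
from typing import List, Tuple

Grid = List[List[float]]


def haar_dwt2(x: Grid) -> Tuple[Grid, Grid, Grid, Grid]:
    """Single-level 2D Haar DWT (illustrative; wavelet choice is an assumption).

    Returns four half-resolution bands computed from each 2x2 block:
    approx (scaled sum), lr (left minus right), tb (top minus bottom),
    diag (main diagonal minus anti-diagonal).
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, lr, tb, diag = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]          # top-left, top-right
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom-left, bottom-right
            r, s = i // 2, j // 2
            approx[r][s] = (a + b + c + d) / 2
            lr[r][s] = (a - b + c - d) / 2
            tb[r][s] = (a + b - c - d) / 2
            diag[r][s] = (a - b - c + d) / 2
    return approx, lr, tb, diag


def haar_idwt2(approx: Grid, lr: Grid, tb: Grid, diag: Grid) -> Grid:
    """Inverse of haar_dwt2: rebuilds the full-resolution grid."""
    h2, w2 = len(approx), len(approx[0])
    out = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for r in range(h2):
        for s in range(w2):
            p, q, u, v = approx[r][s], lr[r][s], tb[r][s], diag[r][s]
            out[2 * r][2 * s] = (p + q + u + v) / 2
            out[2 * r][2 * s + 1] = (p - q + u - v) / 2
            out[2 * r + 1][2 * s] = (p + q - u - v) / 2
            out[2 * r + 1][2 * s + 1] = (p - q - u + v) / 2
    return out


if __name__ == "__main__":
    # Hypothetical 4x4 single-channel "BEV" patch, values chosen arbitrarily.
    bev = [
        [1.0, 2.0, 0.0, 0.0],
        [3.0, 4.0, 0.0, 8.0],
        [5.0, 5.0, 2.0, 2.0],
        [5.0, 5.0, 2.0, 2.0],
    ]
    bands = haar_dwt2(bev)
    restored = haar_idwt2(*bands)
    print("approx band:", bands[0])
    print("round trip exact:", restored == bev)
```

In a setup like this, a refinement step could modify the detail bands before inverting the transform, while a separate attention stage handles long-range context. How Co-Fix3D actually combines the two is described in the paper and the linked repository, not here.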
