CV · Sep 21, 2023

FGFusion: Fine-Grained Lidar-Camera Fusion for 3D Object Detection

arXiv:2309.11804v1 · 8 citations · h-index: 18
Originality: Incremental advance
AI Analysis

The paper addresses 3D object detection for autonomous driving, but the contribution is incremental because it builds on existing lidar-camera fusion methods.

The paper tackles the problem of information loss in lidar-camera fusion for 3D object detection by proposing FGFusion, which uses multi-scale features and fine-grained fusion, achieving improved performance on KITTI and Waymo benchmarks.

Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving. While most prevalent methods progressively downscale the 3D point clouds and camera images and then fuse the high-level features, the downscaled features inevitably lose low-level detailed information. In this paper, we propose Fine-Grained Lidar-Camera Fusion (FGFusion), which makes full use of the multi-scale features of the image and the point cloud and fuses them in a fine-grained way. First, we design a dual-pathway hierarchy structure to extract both high-level semantic and low-level detailed features of the image. Second, an auxiliary network is introduced to guide the point cloud features to better learn the fine-grained spatial information. Finally, we propose multi-scale fusion (MSF) to fuse the last N feature maps of the image and the point cloud. Extensive experiments on two popular autonomous driving benchmarks, i.e., KITTI and Waymo, demonstrate the effectiveness of our method.
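
The abstract does not say how MSF combines the two modalities, so the following is only a minimal sketch of the general idea of fusing the last N feature maps scale by scale. It assumes that the image and point cloud features have already been brought onto a shared spatial grid at each scale, and it uses plain channel concatenation as a stand-in fusion operator. The function name `fuse_last_n` and the `(channels, height, width)` layout are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def fuse_last_n(image_feats, lidar_feats, n):
    """Fuse the last n scales of two feature pyramids, one scale at a time.

    image_feats, lidar_feats: lists of arrays shaped (channels, height, width),
    ordered from the highest-resolution scale to the lowest-resolution scale.
    Channel concatenation is an assumed placeholder for the fusion operator.
    """
    if len(image_feats) != len(lidar_feats):
        raise ValueError("both pyramids must have the same number of scales")
    if not 1 <= n <= len(image_feats):
        raise ValueError("n must be between 1 and the number of scales")

    fused = []
    for img, pts in zip(image_feats[-n:], lidar_feats[-n:]):
        # Assumption: both modalities share the same spatial grid at this scale.
        if img.shape[1:] != pts.shape[1:]:
            raise ValueError("spatial sizes differ at one of the fused scales")
        # Stack the channels of both modalities for this scale.
        fused.append(np.concatenate([img, pts], axis=0))
    return fused


# Three-scale toy pyramids; only the last two scales are fused.
image_pyramid = [np.ones((4, 8, 8)), np.ones((8, 4, 4)), np.ones((16, 2, 2))]
lidar_pyramid = [np.zeros((2, 8, 8)), np.zeros((4, 4, 4)), np.zeros((8, 2, 2))]
fused_maps = fuse_last_n(image_pyramid, lidar_pyramid, n=2)
```

With `n=2`, the highest-resolution scale is skipped and each returned map carries the channels of both modalities at its own resolution, which is the sense in which low-level detail survives alongside the high-level features.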

Code Implementations: 1 repo