CV · RO · Mar 28, 2024

CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

arXiv:2403.19104v1 · 44 citations · h-index: 7 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses the cost barrier of LiDAR sensors for consumer vehicles by improving cheaper camera-radar systems, though its advance over existing fusion techniques is incremental.

The paper tackles the performance gap between LiDAR-Camera and Camera-Radar fusion for 3D object detection in autonomous driving by proposing a cross-modality knowledge distillation framework, achieving enhanced results on the nuScenes dataset.

In the field of 3D object detection for autonomous driving, LiDAR-Camera (LC) fusion is the top-performing sensor configuration. Still, LiDAR is relatively high cost, which hinders adoption of this technology for consumer automobiles. Alternatively, camera and radar are commonly deployed on vehicles already on the road today, but performance of Camera-Radar (CR) fusion falls behind LC fusion. In this work, we propose Camera-Radar Knowledge Distillation (CRKD) to bridge the performance gap between LC and CR detectors with a novel cross-modality KD framework. We use the Bird's-Eye-View (BEV) representation as the shared feature space to enable effective knowledge distillation. To accommodate the unique cross-modality KD path, we propose four distillation losses to help the student learn crucial features from the teacher model. We present extensive evaluations on the nuScenes dataset to demonstrate the effectiveness of the proposed CRKD framework. The project page for CRKD is https://song-jingyu.github.io/CRKD.
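To make the idea of distilling in a shared BEV feature space concrete, here is a minimal sketch of a generic feature-imitation loss between a teacher and a student BEV grid. It is not one of the four CRKD losses, which the abstract does not specify. The function names, the single-channel grid layout, the `mask` argument (for example, a foreground-region weighting) and the `distill_weight` parameter are all assumptions made for illustration.

```python
from typing import List

# Hypothetical layout: one channel of a BEV feature map as an H x W grid.
Grid = List[List[float]]


def bev_feature_distill_loss(teacher: Grid, student: Grid, mask: Grid) -> float:
    """Masked mean squared error between teacher and student BEV features.

    Generic feature-imitation loss (an assumption for illustration),
    not the loss formulation used in the paper.
    """
    if not (len(teacher) == len(student) == len(mask)):
        raise ValueError("teacher, student and mask must have the same height")
    total = 0.0   # weighted sum of squared differences
    weight = 0.0  # sum of mask weights, used for normalisation
    for t_row, s_row, m_row in zip(teacher, student, mask):
        if not (len(t_row) == len(s_row) == len(m_row)):
            raise ValueError("teacher, student and mask must have the same width")
        for t, s, m in zip(t_row, s_row, m_row):
            total += m * (t - s) ** 2
            weight += m
    # An all-zero mask contributes no distillation signal.
    return total / weight if weight > 0 else 0.0


def total_loss(detection_loss: float, distill_loss: float,
               distill_weight: float = 1.0) -> float:
    """Combine the student's detection loss with the distillation term."""
    return detection_loss + distill_weight * distill_loss


if __name__ == "__main__":
    teacher = [[1.0, 2.0], [3.0, 4.0]]   # e.g. LiDAR-camera teacher features
    student = [[1.0, 0.0], [3.0, 2.0]]   # e.g. camera-radar student features
    mask = [[1.0, 1.0], [0.0, 1.0]]      # hypothetical region weighting
    kd = bev_feature_distill_loss(teacher, student, mask)
    print(total_loss(detection_loss=1.0, distill_loss=kd, distill_weight=0.5))
```

Because both detectors project their inputs onto the same BEV grid, the teacher and student features can be compared cell by cell even though the sensors differ. That is the property a shared feature space provides for distillation.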

