CV · RO · Nov 26, 2023

CalibFormer: A Transformer-based Automatic LiDAR-Camera Calibration Network

arXiv:2311.15241v2 · 25 citations · h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses the cumbersome and costly calibration process for LiDAR-camera sensor fusion in autonomous driving, though it is an incremental improvement over prior learning-based methods.

The paper tackled the problem of automatic LiDAR-camera calibration for autonomous driving by proposing CalibFormer, an end-to-end network that achieved a mean translation error of 0.8751 cm and a mean rotation error of 0.0562° on the KITTI dataset, surpassing existing state-of-the-art methods.

The fusion of LiDARs and cameras has been increasingly adopted in autonomous driving for perception tasks. The performance of such fusion-based algorithms largely depends on the accuracy of sensor calibration, which is challenging due to the difficulty of identifying common features across different data modalities. Previously, many calibration methods involved specific targets and/or manual intervention, which has proven to be cumbersome and costly. Learning-based online calibration methods have been proposed, but their performance is barely satisfactory in most cases. These methods usually suffer from issues such as sparse feature maps, unreliable cross-modality association, inaccurate calibration parameter regression, etc. In this paper, to address these issues, we propose CalibFormer, an end-to-end network for automatic LiDAR-camera calibration. We aggregate multiple layers of camera and LiDAR image features to achieve high-resolution representations. A multi-head correlation module is utilized to identify correlations between features more accurately. Lastly, we employ transformer architectures to estimate accurate calibration parameters from the correlation information. Our method achieved a mean translation error of $0.8751 \mathrm{cm}$ and a mean rotation error of $0.0562 ^{\circ}$ on the KITTI dataset, surpassing existing state-of-the-art methods and demonstrating strong robustness, accuracy, and generalization capabilities.
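The translation and rotation errors quoted above compare a predicted extrinsic transform with the ground truth. The following is a minimal sketch of one common way to compute such errors, assuming the translation error is the Euclidean distance between translation vectors and the rotation error is the geodesic angle of the relative rotation. The paper's exact metric definition may differ (for example, per-axis averages), and the function names `rotation_error_deg`, `translation_error_cm`, and `rot_z` are illustrative, not taken from the paper.

```python
import math


def rot_z(deg):
    """Rotation matrix (3x3 nested list) for `deg` degrees about the z axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle, in degrees, of the relative rotation R_gt^T @ R_pred."""
    # trace(R_gt^T @ R_pred) equals the sum of element-wise products.
    trace = sum(R_gt[i][j] * R_pred[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error_cm(t_pred_m, t_gt_m):
    """Euclidean distance between two translations given in metres, in cm."""
    return 100.0 * math.dist(t_pred_m, t_gt_m)


# Hypothetical prediction that is off by 0.05 degrees and 5 mm.
rot_err = rotation_error_deg(rot_z(10.05), rot_z(10.0))
trans_err = translation_error_cm((0.003, 0.004, 0.0), (0.0, 0.0, 0.0))
print(f"rotation error: {rot_err:.4f} deg, translation error: {trans_err:.4f} cm")
```

Averaging these per-sample values over a test set would give mean errors comparable in kind, though not necessarily in definition, to the figures reported in the abstract.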

