MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation
This work addresses the need for reliable and transparent object detection in autonomous driving, though it appears incremental as it builds on existing late fusion techniques.
The paper tackles the problem of multi-modal object detection for autonomous driving by proposing a late fusion method that integrates information from multiple modalities at the decision level, resulting in substantial performance improvements on the KITTI validation and test datasets.
Autonomous driving requires object detection techniques that integrate information from multiple modalities to overcome the limitations of single-modal approaches. Early fusion faces the challenge of aligning heterogeneous data, and deep fusion adds complexity and is prone to overfitting; these drawbacks motivate late fusion at the decision level, which integrates detector outputs without altering the original detectors' network structures. This paper introduces Multi-modal Multi-class Late Fusion (MMLF), a late fusion method that supports multi-class detection. Fusion experiments on the KITTI validation and official test datasets show substantial performance improvements, presenting our model as a versatile solution for multi-modal object detection in autonomous driving. Moreover, our approach incorporates uncertainty analysis into the classification fusion process, making the model more transparent and trustworthy and providing more reliable insight into category predictions.
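To make the idea of decision-level classification fusion with an uncertainty score concrete, the following is a minimal sketch and not the paper's actual method, which the abstract does not specify. It assumes two detectors have already produced per-class probabilities for the same matched object, fuses them with a normalized product (a generic independent-opinion-pool rule chosen here for illustration), and reports normalized Shannon entropy as a simple uncertainty measure. The function names, the class list, and the probability values are all hypothetical.

```python
import math

def fuse_class_probs(p_a, p_b, eps=1e-12):
    """Fuse two per-class probability vectors with a normalized product.

    Illustrative fusion rule only; not taken from the paper.
    """
    if len(p_a) != len(p_b):
        raise ValueError("both detectors must score the same class list")
    # Multiply class-wise scores, clamping to eps to avoid zeroing a class out.
    joint = [max(a, eps) * max(b, eps) for a, b in zip(p_a, p_b)]
    total = sum(joint)
    return [j / total for j in joint]

def normalized_entropy(p, eps=1e-12):
    """Shannon entropy scaled to [0, 1]; higher means less certain."""
    h = -sum(q * math.log(max(q, eps)) for q in p)
    return h / math.log(len(p))

# Hypothetical class scores for one object matched across two modalities.
classes = ["Car", "Pedestrian", "Cyclist"]
camera_probs = [0.6, 0.3, 0.1]
lidar_probs = [0.7, 0.1, 0.2]

fused = fuse_class_probs(camera_probs, lidar_probs)
label = classes[max(range(len(fused)), key=fused.__getitem__)]
uncertainty = normalized_entropy(fused)
print(label, [round(q, 3) for q in fused], round(uncertainty, 3))
```

Because the fusion operates only on detector outputs, neither detector's network needs to change, which is the property the abstract attributes to late fusion. In this toy case both detectors favor the same class, so the fused distribution is sharper and its entropy is lower than that of either input.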