CV AI LG · Feb 16, 2022

Reducing Overconfidence Predictions for Autonomous Driving Perception

Gledson Melotti, Cristiano Premebida, Jordan J. Bird, Diego R. Faria, Nuno Gonçalves

arXiv:2202.07825v27.316 citations

Originality: Incremental advance

AI Analysis

This work addresses a critical issue for autonomous driving and robotics by improving the probabilistic interpretation of perception outputs. The advance is incremental, since it builds on existing pre-trained networks.

The paper tackles overconfident predictions in deep learning for object recognition, which can harm perception systems in autonomous driving. It proposes a probabilistic approach that applies Maximum Likelihood (ML) and Maximum a-Posteriori (MAP) functions to Logit layer scores, and reports promising performance compared with SoftMax and Sigmoid layers on the KITTI and Lyft Level-5 datasets.

In state-of-the-art deep learning for object recognition, SoftMax and Sigmoid functions are the most commonly employed predictor outputs. Such layers often produce overconfident predictions rather than proper probabilistic scores, which can harm the decision-making of 'critical' perception systems applied in autonomous driving and robotics. Given this, this work proposes a probabilistic approach based on distributions calculated from the Logit layer scores of pre-trained networks. We demonstrate that Maximum Likelihood (ML) and Maximum a-Posteriori (MAP) functions are more suitable for probabilistic interpretations than SoftMax and Sigmoid-based predictions for object recognition. We explore distinct sensor modalities via RGB images and LiDAR range-view (RV) data from the KITTI and Lyft Level-5 datasets, where our approach shows promising performance compared to the usual SoftMax and Sigmoid layers, with the benefit of enabling interpretable probabilistic predictions. Another advantage of the approach is that the ML and MAP functions can be implemented in existing trained networks, since they operate on the output of the Logit layer of pre-trained networks. Thus, there is no need to carry out a new training phase: the ML and MAP functions are used in the test/prediction phase.
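
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says that distributions are calculated from Logit layer scores, so the choice of independent per-dimension Gaussians, the function names (`fit_logit_gaussians`, `ml_predict`, `map_predict`), and the synthetic logits below are all assumptions made for illustration. The sketch fits one density per class to logits collected from an already-trained network, then scores a new logit vector by normalized likelihood (ML) or by likelihood times class prior (MAP).

```python
import math
import random


def fit_logit_gaussians(logits, labels, num_classes, eps=1e-6):
    """Fit, per class, a mean and variance for each logit dimension.

    Assumption: independent Gaussians per dimension stand in for the
    per-class logit distributions; other density models would also fit here.
    """
    dim = len(logits[0])
    params = []
    for c in range(num_classes):
        rows = [x for x, y in zip(logits, labels) if y == c]
        means = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
        variances = [
            sum((r[d] - means[d]) ** 2 for r in rows) / len(rows) + eps
            for d in range(dim)
        ]
        params.append((means, variances))
    return params


def log_likelihood(x, means, variances):
    """Log-density of logit vector x under one class's Gaussian model."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def normalize(log_scores):
    """Turn log-scores into values that sum to 1 (max-shifted for stability)."""
    top = max(log_scores)
    scores = [math.exp(s - top) for s in log_scores]
    total = sum(scores)
    return [s / total for s in scores]


def ml_predict(x, params):
    """ML: class likelihoods P(x | class), normalized over classes."""
    return normalize([log_likelihood(x, m, v) for m, v in params])


def map_predict(x, params, priors):
    """MAP: P(x | class) * P(class), normalized over classes."""
    return normalize(
        [log_likelihood(x, m, v) + math.log(p) for (m, v), p in zip(params, priors)]
    )


# Synthetic stand-in for logits from a pre-trained network (hypothetical data):
# the true class's logit is centered at 4, the others at 0.
random.seed(0)
NUM_CLASSES = 3
train_labels = [i % NUM_CLASSES for i in range(600)]
train_logits = [
    [(4.0 if d == y else 0.0) + random.gauss(0.0, 1.0) for d in range(NUM_CLASSES)]
    for y in train_labels
]

params = fit_logit_gaussians(train_logits, train_labels, NUM_CLASSES)
priors = [train_labels.count(c) / len(train_labels) for c in range(NUM_CLASSES)]

ml_probs = ml_predict([4.0, 0.0, 0.0], params)
map_probs = map_predict([4.0, 0.0, 0.0], params, priors)
```

Because the densities are fitted to logits the network already produces, the network weights are never updated, which matches the abstract's point that no new training phase is required. With the balanced synthetic labels used here the priors are uniform, so the ML and MAP outputs coincide; they differ only when the class priors are unequal.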
