On the calibration of underrepresented classes in LiDAR-based semantic segmentation
This work addresses safety concerns for vulnerable road users in autonomous driving by providing insights for model selection. The contribution is incremental, since it focuses on evaluation rather than on new methods.
The paper evaluates confidence calibration for underrepresented classes in LiDAR-based semantic segmentation. It compares three models, each in a deterministic and a probabilistic version, and finds that the calibration quality of a class depends on the predictive performance for that class.
The calibration of deep learning-based perception models plays a crucial role in their reliability. Our work focuses on a class-wise evaluation of several models' confidence performance for LiDAR-based semantic segmentation, with the aim of providing insights into the calibration of underrepresented classes. Those classes often include vulnerable road users (VRUs) and are thus of particular interest for safety reasons. With the help of a metric based on sparsification curves, we compare the calibration abilities of three semantic segmentation models with different architectural concepts, each in a deterministic and a probabilistic version. By identifying and describing the dependency between the predictive performance of a class and the respective calibration quality, we aim to facilitate model selection and refinement for safety-critical applications.
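To make the idea of a class-wise, sparsification-based calibration metric concrete, the following is a minimal sketch and not necessarily the metric used in this work. It assumes the commonly used area under the sparsification error (AUSE): points are removed in order of decreasing uncertainty, the error of the remaining points is tracked, and the result is compared with an oracle that removes the truly erroneous points first. The function names, the use of `1 - confidence` as the uncertainty, and the number of steps are illustrative assumptions.

```python
import numpy as np


def sparsification_curve(errors, ranking, fractions):
    """Mean error of the points left after removing the top-ranked fraction."""
    order = np.argsort(-ranking, kind="stable")  # highest ranking is removed first
    sorted_err = errors[order]
    n = len(errors)
    curve = []
    for f in fractions:
        k = int(np.floor(f * n))  # number of points removed at this step
        remaining = sorted_err[k:]
        curve.append(remaining.mean() if remaining.size else 0.0)
    return np.array(curve)


def ause(errors, uncertainty, num_steps=20):
    """Area between the model's sparsification curve and the oracle curve.

    Assumption: lower is better, and 0 means the uncertainty ranks the
    points exactly as well as the true error does.
    """
    fractions = np.linspace(0.0, 1.0, num_steps, endpoint=False)
    model_curve = sparsification_curve(errors, uncertainty, fractions)
    oracle_curve = sparsification_curve(errors, errors, fractions)
    gap = model_curve - oracle_curve
    # Trapezoidal integration of the gap over the removed fraction.
    return float(np.sum((gap[1:] + gap[:-1]) / 2.0 * np.diff(fractions)))


def classwise_ause(labels, preds, confidences, num_classes):
    """AUSE per ground-truth class, using 1 - confidence as the uncertainty."""
    results = {}
    for c in range(num_classes):
        mask = labels == c
        if not mask.any():
            continue  # class absent from this sample
        errors = (preds[mask] != c).astype(float)  # 1.0 where the point is misclassified
        uncertainty = 1.0 - confidences[mask]
        results[c] = ause(errors, uncertainty)
    return results
```

Reporting such a score per class next to a per-class performance measure (for example, IoU) is one way to inspect the dependency between predictive performance and calibration quality that the abstract describes.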