LG · Nov 4, 2025

Calibration improves detection of mislabeled examples

Ilies Chibane, Thomas George, Pierre Nodet, Vincent Lemaire

arXiv:2511.02738v1 · h-index: 3

Originality: Incremental advance

AI Analysis

The work offers a practical option for industrial applications that must handle mislabeled data, but the advance is incremental because it builds on existing detection methods.

The paper addresses mislabeled data in machine learning by studying how calibrating the base model used for detection affects the results. It reports that calibration improves both the accuracy and the robustness of mislabeled-instance detection.

Mislabeled data is a pervasive issue that undermines the performance of machine learning systems in real-world applications. An effective approach to mitigate this problem is to detect mislabeled instances and subject them to special treatment, such as filtering or relabeling. Automatic mislabeling detection methods typically rely on training a base machine learning model and then probing it for each instance to obtain a trust score indicating whether each provided label is genuine or incorrect. The properties of this base model are thus of paramount importance. In this paper, we investigate the impact of calibrating this model. Our empirical results show that using calibration methods improves the accuracy and robustness of mislabeled instance detection, providing a practical and effective solution for industrial applications.
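To make the pipeline described in the abstract concrete, here is a minimal sketch of one way it could look. It is not the authors' implementation. It assumes temperature scaling as the calibration method (the abstract does not say which methods were used), assumes the base model exposes raw logits, and takes the trust score to be the calibrated probability assigned to the provided label. All function names, the grid of temperatures, the 0.5 threshold, and the toy numbers are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn one row of logits into probabilities, after dividing by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the labels at a given temperature."""
    losses = [
        -math.log(softmax(row, temperature)[label])
        for row, label in zip(logit_rows, labels)
    ]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature that minimises NLL on held-out data (simple grid search)."""
    if grid is None:
        grid = [0.25 * k for k in range(1, 41)]  # hypothetical grid: 0.25 .. 10.0
    return min(grid, key=lambda t: mean_nll(logit_rows, labels, t))


def trust_scores(logit_rows, given_labels, temperature=1.0):
    """Trust score = calibrated probability the model assigns to the provided label."""
    return [
        softmax(row, temperature)[label]
        for row, label in zip(logit_rows, given_labels)
    ]


def flag_suspects(scores, threshold=0.5):
    """Indices whose trust score falls below an (assumed) threshold."""
    return [i for i, score in enumerate(scores) if score < threshold]


# Toy held-out logits from an overconfident binary classifier (made-up numbers).
val_logits = [[4.0, 0.0], [3.0, 0.0], [0.0, 4.0], [4.0, 0.0], [0.0, 3.0], [3.0, 0.0]]
val_labels = [0, 0, 1, 1, 1, 0]

temperature = fit_temperature(val_logits, val_labels)

# Score two instances with identical logits but different provided labels.
scores = trust_scores([[2.0, 0.0], [2.0, 0.0]], [0, 1], temperature)
suspects = flag_suspects(scores)
```

Instances with low trust scores would then be candidates for the filtering or relabeling mentioned above. In this sketch the calibration step changes how confident the scores are, which matters whenever a fixed threshold is applied. The held-out labels used to fit the temperature may themselves be noisy, which the sketch does not address.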
