LG AI · Feb 6, 2023

Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness

Ailin Deng, Shen Li, Miao Xiong, Zhirui Chen, Bryan Hooi

arXiv:2302.02628v16.64 citations · h-index: 10 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the trustworthiness of deep learning models in practical deployment. It is an incremental advance: a plug-and-play framework that enhances existing trustworthiness-related methods.

The paper tackles untrustworthy predictive confidence scores in deep learning models, which are often overconfident. It introduces a self-supervised probing framework that improves trustworthiness on three tasks (misclassification detection, calibration, and out-of-distribution detection), as verified through extensive experiments on multiple benchmarks.

Trustworthy machine learning is of primary importance to the practical deployment of deep learning models. While state-of-the-art models achieve astonishingly good performance in terms of accuracy, recent literature reveals that their predictive confidence scores unfortunately cannot be trusted: e.g., they are often overconfident when their predictions are wrong, and even on obvious outliers. In this paper, we introduce a new approach, self-supervised probing, which enables us to check and mitigate the overconfidence issue for a trained model, thereby improving its trustworthiness. We provide a simple yet effective framework, which can be flexibly applied to existing trustworthiness-related methods in a plug-and-play manner. Extensive experiments on three trustworthiness-related tasks (misclassification detection, calibration and out-of-distribution detection) across various benchmarks verify the effectiveness of our proposed probing framework.
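
To make "overconfident" concrete, here is a minimal sketch of two standard quantities used in this area: the maximum softmax probability as a confidence score, and the expected calibration error (ECE), which measures the gap between confidence and accuracy. It does not implement the paper's self-supervised probing framework, which this page does not describe in detail. The function names, the bin count, and the equal-width binning are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_confidence(logits):
    """Return (predicted_class, confidence) using the max softmax probability."""
    probs = softmax(logits)
    conf = max(probs)
    return probs.index(conf), conf


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins (hypothetical helper, assumed binning).

    confidences: list of floats in [0, 1], one per prediction
    correct:     list of bools, True where the prediction was right
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # An overconfident model: 95% confident, but right only half the time.
    overconfident = expected_calibration_error(
        [0.95, 0.95, 0.95, 0.95], [True, True, False, False]
    )
    # A well-calibrated model: 75% confident and right 75% of the time.
    calibrated = expected_calibration_error(
        [0.75, 0.75, 0.75, 0.75], [True, True, True, False]
    )
    print(overconfident, calibrated)
```

In this toy setup, the overconfident case has a large gap between stated confidence and observed accuracy, while the calibrated case has a gap near zero. Misclassification detection and out-of-distribution detection typically threshold a score such as the one returned by `max_softmax_confidence`.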
