CV | AI | LG | Nov 5, 2024

Self-Calibrated Tuning of Vision-Language Models for Out-of-Distribution Detection

arXiv:2411.03359v1 | 24 citations | h-index: 15 | NIPS
Originality: Incremental advance
AI Analysis

This work addresses the reliability of machine learning models in open-world applications, representing an incremental improvement over existing CLIP-based methods.

The paper tackles out-of-distribution (OOD) detection in vision-language models. It proposes Self-Calibrated Tuning (SCT) to mitigate the spurious context mined from in-distribution (ID) data, and reports improved OOD detection performance across extensive experiments.

Out-of-distribution (OOD) detection is crucial for deploying reliable machine learning models in open-world applications. Recent advances in CLIP-based OOD detection have shown promising results via regularizing prompt tuning with OOD features extracted from ID data. However, the irrelevant context mined from ID data can be spurious due to the inaccurate foreground-background decomposition, thus limiting the OOD detection performance. In this work, we propose a novel framework, namely, Self-Calibrated Tuning (SCT), to mitigate this problem for effective OOD detection with only the given few-shot ID data. Specifically, SCT introduces modulating factors respectively on the two components of the original learning objective. It adaptively directs the optimization process between the two tasks during training on data with different prediction uncertainty to calibrate the influence of OOD regularization, which is compatible with many prompt tuning based OOD detection methods. Extensive experiments and analyses have been conducted to characterize and demonstrate the effectiveness of the proposed SCT. The code is publicly available.
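The abstract describes modulating factors on the two components of the learning objective, driven by prediction uncertainty, but this page does not give their exact form. The sketch below is one plausible reading, not the authors' implementation. It assumes the classification term is weighted by `1 - p` and the OOD regularization term by `p`, where `p` is the predicted probability of the ground-truth class, and it assumes the regularizer is the negative entropy of predictions on surrogate OOD features. The function name `sct_style_loss` and the parameter `lam` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sct_style_loss(id_logits, label, ood_logits, lam=1.0):
    """Per-sample loss with confidence-based modulating factors.

    Illustrative assumptions (not confirmed by the source text):
      - the classification weight is 1 - p and the OOD weight is p,
        where p is the probability assigned to the true class;
      - the OOD regularizer is the negative entropy of the prediction
        on surrogate OOD features mined from the ID image.
    """
    probs = softmax(id_logits)
    p_true = probs[label]
    ce = -math.log(p_true)  # classification (cross-entropy) term

    # Negative entropy: minimizing it pushes OOD predictions toward uniform.
    q = softmax(ood_logits)
    neg_entropy = sum(qi * math.log(qi) for qi in q if qi > 0.0)

    w_cls = 1.0 - p_true  # uncertain samples emphasize classification
    w_ood = p_true        # confident samples emphasize OOD regularization
    loss = w_cls * ce + w_ood * lam * neg_entropy
    return loss, w_cls, w_ood


# A confident sample shifts nearly all weight onto the OOD regularizer,
# while an uncertain one keeps most weight on classification.
confident = sct_style_loss([10.0, 0.0, 0.0], 0, [0.0, 0.0, 0.0])
uncertain = sct_style_loss([0.0, 0.0, 0.0], 0, [0.0, 0.0, 0.0])
```

Under these assumptions the two weights always sum to one, so the total loss scale stays bounded while the balance between the two tasks moves with the model's confidence on each sample.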

Code Implementations: 1 repo