CV · Apr 11

Near OOD Detection for Vision-Language Prompt Learning with Contrastive Logit Score

arXiv:2405.16091 · 51.43 citations · h-index: 5 · Has Code
Predicted impact top 68% in CV · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the underexplored problem of near OOD detection in vision-language prompt learning, offering a simple plug-and-play solution.

The paper introduces Contrastive Logit Score (CLS), a post-hoc scoring function that improves near OOD detection for vision-language prompt learning methods by up to 11.67% AUROC without retraining or architectural changes.

Prompt learning has emerged as an efficient and effective method for fine-tuning vision-language models such as CLIP. While many studies have explored the generalisation abilities of these models in few-shot classification tasks, and a few have addressed far out-of-distribution (OOD) detection with these models, their potential for near OOD detection remains underexplored. Existing methods either require training from scratch, need fine-tuning, or are not designed for vision-language prompt learning. To address this, we introduce the Contrastive Logit Score (CLS), a novel post-hoc, plug-and-play scoring function. CLS significantly improves near OOD detection of pre-trained vision-language prompt learning methods without modifying their model architectures or requiring retraining. Our method achieves up to an 11.67% improvement in AUROC for near OOD detection with minimal computational overhead. Extensive evaluations validate the effectiveness, efficiency, and generalisability of our approach. Our code is available at https://github.com/davidmcjung/near-OOD-prompt-learning.
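To make "post-hoc scoring function" and "AUROC" concrete, here is a minimal sketch of how such a score is typically evaluated. It does not implement CLS, whose formula is not given on this page; it uses the common maximum softmax probability (MSP) baseline as a stand-in. The function names and the toy logits are hypothetical, chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one row of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits, temperature=1.0):
    """Maximum softmax probability (a baseline, NOT the paper's CLS).

    Higher values are read as "more likely in-distribution".
    """
    return max(softmax(logits, temperature))


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample.

    Ties count as half a win. This pairwise form is O(n*m) and is meant
    for small illustrative inputs only.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical logits, standing in for the output of a frozen prompt-learned model.
id_logits = [[9.0, 1.0, 0.5], [0.2, 8.0, 1.0], [1.0, 0.5, 7.5]]
ood_logits = [[3.0, 2.8, 2.5], [2.0, 2.1, 1.9], [6.0, 5.5, 1.0]]

# Post-hoc: scores are computed from logits alone, with no retraining.
id_scores = [msp_score(z) for z in id_logits]
ood_scores = [msp_score(z) for z in ood_logits]

result = auroc(id_scores, ood_scores)
print(f"AUROC (MSP baseline, toy data): {result:.3f}")
```

Because the score is a function of the logits only, swapping in a different scoring function means replacing `msp_score` while leaving the model untouched, which is the sense in which a method like CLS can be plug-and-play. Near OOD samples tend to produce confident, ID-like logits, so a baseline such as MSP separates them less cleanly than the toy data here suggests.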

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
