NegRefine: Refining Negative Label-Based Zero-Shot OOD Detection
This work targets false OOD detections (in-distribution images wrongly flagged as OOD) for users of zero-shot methods in computer vision. The contribution is incremental: it refines existing negative label-based approaches rather than introducing a new paradigm.
The paper tackles zero-shot out-of-distribution (OOD) detection in vision-language models by addressing two issues with negative labels, namely labels that are subcategories of in-distribution labels and labels that are proper nouns, with the aim of improving robustness on benchmarks such as ImageNet-1K.
Recent advances in vision-language models such as CLIP have enabled zero-shot OOD detection by leveraging both image and textual label information. Among these, negative label-based methods such as NegLabel and CSP have shown promising results by drawing negative labels from a lexicon of words and using them to distinguish OOD samples. However, these methods tend to misclassify in-distribution samples as OOD when the negative label set contains subcategories of in-distribution labels or proper nouns. They are also limited in handling images that match multiple in-distribution and negative labels at once. We propose NegRefine, a novel negative label refinement framework for zero-shot OOD detection. NegRefine introduces a filtering mechanism that excludes subcategory labels and proper nouns from the negative label set, and a multi-matching-aware scoring function that dynamically adjusts the contributions of multiple labels matching an image. Together, these yield a more robust separation between in-distribution and OOD samples. We evaluate NegRefine on large-scale benchmarks, including ImageNet-1K. The code is available at https://github.com/ah-ansari/NegRefine.
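To make the subcategory failure mode concrete, the following is a minimal sketch in plain Python, not the released implementation. It assumes a simplified negative label-based score (the softmax mass that an image places on in-distribution labels, relative to in-distribution plus negative labels, with higher meaning more in-distribution) and uses hand-made 2-D vectors in place of CLIP embeddings. The function names, the temperature value, and the two predicates `is_subcategory` and `is_proper_noun` are hypothetical stand-ins; the abstract does not specify how the filter decides, and the multi-matching-aware scoring function is not reproduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def negative_label_score(image_emb, id_embs, neg_embs, temperature=0.01):
    """Assumed simplified score: softmax mass on ID labels over ID + negative labels.

    Higher values mean the image looks more in-distribution.
    """
    id_logits = [cosine(image_emb, e) / temperature for e in id_embs]
    neg_logits = [cosine(image_emb, e) / temperature for e in neg_embs]
    shift = max(id_logits + neg_logits)  # subtract the max for numerical stability
    id_mass = sum(math.exp(l - shift) for l in id_logits)
    neg_mass = sum(math.exp(l - shift) for l in neg_logits)
    return id_mass / (id_mass + neg_mass)


def filter_negative_labels(neg_labels, is_subcategory, is_proper_noun):
    """Drop negative labels flagged as ID subcategories or proper nouns.

    Both predicates are hypothetical callables supplied by the caller.
    """
    return [w for w in neg_labels if not is_subcategory(w) and not is_proper_noun(w)]


# Toy 2-D "embeddings" (illustrative values only, not real CLIP outputs).
id_embs = [[1.0, 0.0]]                      # ID label: "dog"
neg_embs_by_label = {
    "labrador": [0.98, 0.20],               # subcategory of the ID label "dog"
    "galaxy": [0.00, 1.00],                 # genuinely unrelated negative label
    "Paris": [0.10, 0.90],                  # proper noun
}
image_emb = [0.97, 0.25]                    # an in-distribution image of a labrador

neg_labels = list(neg_embs_by_label)
filtered = filter_negative_labels(
    neg_labels,
    is_subcategory=lambda w: w in {"labrador"},   # hypothetical lookup
    is_proper_noun=lambda w: w[:1].isupper(),     # crude capitalization check
)

score_before = negative_label_score(
    image_emb, id_embs, [neg_embs_by_label[w] for w in neg_labels]
)
score_after = negative_label_score(
    image_emb, id_embs, [neg_embs_by_label[w] for w in filtered]
)
print(f"before filtering: {score_before:.3f}, after filtering: {score_after:.3f}")
```

In this toy setup the unfiltered negative set contains "labrador", which matches the image more closely than the in-distribution label "dog", so the score drops and the in-distribution image would be flagged as OOD. Once the subcategory label is removed, the score moves back toward 1.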