CV · AI · LG · Mar 26

Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language Models

arXiv:2603.2525087.91 citations · h-index: 22 · Has Code
Predicted impact: top 18% in CV · last 90 days · Originality: Incremental advance
AI Analysis

This paper addresses OOD detection for vision-language models with a training-free, test-efficient method that improves robustness. The advance is incremental, since it builds on existing negative-label pipelines.

The paper tackles out-of-distribution (OOD) detection in vision-language models by proposing TANL, a method that dynamically selects negative labels based on their activation levels at test time. On the ImageNet benchmark, it reduces the false positive rate at 95% true positive rate (FPR95) from 17.5% to 9.8%.
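For readers unfamiliar with the metric, FPR95 is the fraction of OOD samples still accepted as ID when the decision threshold is set so that 95% of ID samples are accepted. A minimal sketch, assuming that a higher score means "more ID-like" (the function name and the score convention are illustrative, not taken from the paper's code):

```python
import numpy as np


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted when 95% of ID samples are accepted.

    Assumption: higher score = more in-distribution.
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold at the 5th percentile of ID scores, so about 95% of ID
    # samples score at or above it (true positive rate of about 95%).
    threshold = np.percentile(id_scores, 5)
    # OOD samples at or above the threshold are false positives.
    return float(np.mean(ood_scores >= threshold))
```

Lower is better: a drop from 17.5% to 9.8% means fewer OOD images slip past the detector at the same ID acceptance rate.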

Out-of-distribution (OOD) detection aims to identify samples that deviate from the in-distribution (ID) data. One popular pipeline addresses this by introducing negative labels distant from ID classes and detecting OOD samples based on their distance to these labels. However, such labels may exhibit poor activation on OOD samples and thus fail to capture OOD characteristics. To address this, we propose Test-time Activated Negative Labels (TANL), which dynamically evaluates activation levels across the corpus dataset and mines candidate labels with high activation responses during testing. Specifically, TANL identifies high-confidence test images online and accumulates their assignment probabilities over the corpus to construct a label activation metric. This metric leverages historical test samples to adaptively align with the test distribution, enabling the selection of distribution-adaptive activated negative labels. By further exploring the activation information within the current testing batch, we introduce a more fine-grained, batch-adaptive variant. To fully utilize label activation knowledge, we propose an activation-aware score function that emphasizes negative labels with stronger activations, boosting performance and enhancing robustness to the number of labels. TANL is training-free, test-efficient, and grounded in theoretical justification. Experiments on diverse backbones and a wide range of task settings validate its effectiveness. Notably, on the large-scale ImageNet benchmark, TANL significantly reduces the FPR95 from 17.5% to 9.8%. Code is available at https://github.com/YBZh/OpenOOD-VLM.
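The pipeline described in the abstract (accumulate assignment probabilities of confident test images over a corpus, pick the most activated labels as negatives, then weight them in the score) can be illustrated with a rough sketch. Everything concrete here is an assumption made for illustration: the class name `ActivatedNegativeLabels`, the rule that "high-confidence" means low probability mass on ID classes, the threshold `ood_conf`, the temperature, and the weighted-softmax score. None of it is taken from the paper or the OpenOOD-VLM repository.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ActivatedNegativeLabels:
    """Illustrative sketch only; not the authors' implementation."""

    def __init__(self, id_text, corpus_text, num_neg=2,
                 temperature=0.01, ood_conf=0.5):
        self.id_text = id_text          # (C, d) normalised ID class embeddings
        self.corpus_text = corpus_text  # (M, d) normalised corpus embeddings
        self.num_neg = num_neg
        self.temperature = temperature  # assumed CLIP-style temperature
        self.ood_conf = ood_conf        # assumed confidence threshold
        self.activation = np.zeros(len(corpus_text))  # running activation

    def update(self, image_feats):
        """Accumulate corpus probabilities of confidently non-ID images."""
        text = np.concatenate([self.id_text, self.corpus_text])
        probs = softmax(image_feats @ text.T / self.temperature)
        num_id = len(self.id_text)
        id_mass = probs[:, :num_id].sum(axis=1)
        confident = id_mass < self.ood_conf  # assumed selection rule
        self.activation += probs[confident, num_id:].sum(axis=0)

    def select(self):
        """Return indices and weights of the most activated corpus labels."""
        idx = np.argsort(-self.activation)[: self.num_neg]
        act = self.activation[idx] + 1e-12   # keep weights strictly positive
        return idx, act / act.mean()         # weights average to 1

    def score(self, image_feats):
        """ID score in [0, 1]; negatives are weighted by their activation."""
        idx, weights = self.select()
        text = np.concatenate([self.id_text, self.corpus_text[idx]])
        logits = image_feats @ text.T / self.temperature
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        num_id = len(self.id_text)
        id_mass = e[:, :num_id].sum(axis=1)
        neg_mass = (e[:, num_id:] * weights).sum(axis=1)
        return id_mass / (id_mass + neg_mass)
```

In use, one would call `update` on each incoming batch of image embeddings and then `score` on the same batch, flagging low-scoring images as OOD. The sketch needs no training and adds only a running vector of size M, which is in the spirit of the "training-free, test-efficient" claim, though the real method's details may differ.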
