CV, LG · May 21, 2024

How to train your ViT for OOD Detection

arXiv:2405.17447v1 · 2 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work aims to improve out-of-distribution (OOD) detection for computer vision applications. It is incremental: it optimizes existing methods rather than introducing new paradigms.

The paper investigates how pretraining and finetuning schemes affect the out-of-distribution detection performance of Vision Transformers. It finds that the type of pretraining strongly influences results, and it identifies a best-practice training recipe.

Vision Transformers (ViTs) have been shown to be powerful out-of-distribution (OOD) detectors for ImageNet-scale settings when finetuned from publicly available checkpoints, often outperforming other model types on popular benchmarks. In this work, we investigate the impact of both the pretraining and the finetuning scheme on the performance of ViTs on this task by analyzing a large pool of models. We find that the exact type of pretraining has a strong impact on which method works well and on OOD detection performance in general. We further show that certain training schemes might only be effective for a specific type of out-distribution, but not in general, and identify a best-practice training recipe.
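
For readers new to the task, the following is a minimal sketch of how an OOD detector is commonly scored. It is an illustration only: the abstract does not say which detection methods the paper compares. Maximum softmax probability (MSP) is swapped in here as a well-known baseline, and the logits and the helper names `softmax`, `msp_score`, and `auroc` are hypothetical. Each input gets a confidence score, and AUROC measures how well those scores separate in-distribution inputs from OOD inputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability; higher means 'looks more in-distribution'."""
    return max(softmax(logits))


def auroc(in_scores, out_scores):
    """Fraction of (in, out) pairs where the in-distribution sample scores
    higher than the OOD sample; ties count as half."""
    wins = 0.0
    for s_in in in_scores:
        for s_out in out_scores:
            if s_in > s_out:
                wins += 1.0
            elif s_in == s_out:
                wins += 0.5
    return wins / (len(in_scores) * len(out_scores))


# Hypothetical classifier logits, made up for illustration (not from the paper).
in_logits = [[6.0, 0.5, 0.1], [0.2, 5.0, 0.3]]    # confident predictions
out_logits = [[1.0, 1.1, 0.9], [0.4, 0.6, 0.5]]   # near-uniform predictions

in_scores = [msp_score(l) for l in in_logits]
out_scores = [msp_score(l) for l in out_logits]
print(auroc(in_scores, out_scores))
```

An AUROC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. Under this kind of protocol, comparing training recipes amounts to comparing such scores across several OOD test sets, which is where a recipe that helps on one type of out-distribution may fail on another.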

