DisCoPatch: Taming Adversarially-driven Batch Statistics for Improved Out-of-Distribution Detection
The work targets covariate-shift detection as a route to better OOD detection in machine learning applications. It offers an efficient solution with a compact model and lower latency, although the contribution appears incremental because it builds on existing adversarial and VAE methods.
The paper tackles the problem of detecting subtle covariate shifts for out-of-distribution (OOD) detection by introducing DisCoPatch, an unsupervised adversarial VAE framework that exploits batch statistics from adversarial discriminators, achieving state-of-the-art results with 95.5% AUROC on ImageNet-1K(-C) and 95.0% on Near-OOD benchmarks.
Out-of-distribution (OOD) detection is important across many applications. While semantic and domain-shift OOD problems are well-studied, this work focuses on covariate shifts: subtle variations in the data distribution that can degrade machine learning performance. We hypothesize that detecting these subtle shifts can improve our understanding of in-distribution boundaries, ultimately improving OOD detection. In adversarial discriminators trained with Batch Normalization (BN), real and adversarial samples form distinct domains with unique batch statistics, a property we exploit for OOD detection. We introduce DisCoPatch, an unsupervised Adversarial Variational Autoencoder (VAE) framework that harnesses this mechanism. During inference, each batch consists of patches from the same image, ensuring a consistent data distribution that allows the model to rely on batch statistics. DisCoPatch uses the VAE's suboptimal outputs (generated and reconstructed) as negative samples to train the discriminator, thereby improving its ability to delineate the boundary between in-distribution samples and covariate shifts. By tightening this boundary, DisCoPatch achieves state-of-the-art results on public OOD detection benchmarks. The proposed model not only excels at detecting covariate shifts, achieving 95.5% AUROC on ImageNet-1K(-C), but also outperforms all prior methods on public Near-OOD benchmarks (95.0%). With a compact model size of 25MB, it achieves high OOD detection performance at notably lower latency than existing methods, making it an efficient and practical solution for real-world OOD detection applications. The code is publicly available.
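The idea of building an inference batch from patches of a single image, so that batch statistics describe that image alone, can be illustrated with a minimal Python sketch. This is not DisCoPatch's discriminator: a hand-written high-frequency-energy feature stands in for learned BN activations, and the patch size, patch count, noise level, and all function names are assumptions made for illustration only.

```python
import random
from statistics import fmean, pvariance


def extract_patches(image, patch_size, num_patches, rng):
    """Crop random square patches from one 2D image (a list of pixel rows)."""
    height, width = len(image), len(image[0])
    patches = []
    for _ in range(num_patches):
        top = rng.randrange(height - patch_size + 1)
        left = rng.randrange(width - patch_size + 1)
        patches.append([row[left:left + patch_size]
                        for row in image[top:top + patch_size]])
    return patches


def high_frequency_energy(patch):
    """Toy per-patch feature: mean squared difference of horizontal neighbours."""
    diffs = [(row[i + 1] - row[i]) ** 2
             for row in patch for i in range(len(row) - 1)]
    return fmean(diffs)


def patch_batch_statistics(image, patch_size=8, num_patches=16, seed=0):
    """Mean and variance of the toy feature over a batch of same-image patches."""
    rng = random.Random(seed)
    patches = extract_patches(image, patch_size, num_patches, rng)
    features = [high_frequency_energy(p) for p in patches]
    return fmean(features), pvariance(features)


# Hypothetical data: a smooth gradient image and a noisy copy of it,
# the latter standing in for a covariate-shifted input.
size = 32
clean = [[(x + y) / (2 * (size - 1)) for x in range(size)] for y in range(size)]
noise_rng = random.Random(1)
shifted = [[v + noise_rng.gauss(0.0, 0.1) for v in row] for row in clean]

clean_mean, clean_var = patch_batch_statistics(clean)
shifted_mean, shifted_var = patch_batch_statistics(shifted)
print("clean batch mean:  ", clean_mean)
print("shifted batch mean:", shifted_mean)
```

Because every patch in a batch comes from the same image, the batch mean of the feature moves as a whole when the image is corrupted, which is the kind of signal a BN-based discriminator could pick up. In the actual method this role is played by a trained discriminator rather than a fixed feature.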