LG CV

Jun 15, 2023

OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection

Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li

Berkeley

arXiv:2306.09301v539.4220 citations

h-index: 65

Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for reliable, scalable evaluation benchmarks in OOD detection research, which is critical for open-world intelligent systems. The contribution is incremental, however, because it builds on the earlier OpenOOD v1.

The paper tackles inconsistent evaluation in out-of-distribution (OOD) detection by introducing OpenOOD v1.5, an enhanced benchmark that provides standardized, comprehensive evaluation. It extends the benchmark to large-scale datasets such as ImageNet and to foundation models such as CLIP and DINOv2, and it broadens the scope to full-spectrum OOD detection.

Out-of-Distribution (OOD) detection is critical for the reliable operation of open-world intelligent systems. Despite the emergence of an increasing number of OOD detection methods, the evaluation inconsistencies present challenges for tracking the progress in this field. OpenOOD v1 initiated the unification of the OOD detection evaluation but faced limitations in scalability and scope. In response, this paper presents OpenOOD v1.5, a significant improvement from its predecessor that ensures accurate and standardized evaluation of OOD detection methodologies at large scale. Notably, OpenOOD v1.5 extends its evaluation capabilities to large-scale data sets (ImageNet) and foundation models (e.g., CLIP and DINOv2), and expands its scope to investigate full-spectrum OOD detection which considers semantic and covariate distribution shifts at the same time. This work also contributes in-depth analysis and insights derived from comprehensive experimental results, thereby enriching the knowledge pool of OOD detection methodologies. With these enhancements, OpenOOD v1.5 aims to drive advancements and offer a more robust and comprehensive evaluation benchmark for OOD detection research.
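
The abstract's notion of full-spectrum OOD detection can be illustrated with a small sketch. It is not taken from the OpenOOD code base. It assumes that a detector outputs a confidence score (higher means more in-distribution), that covariate-shifted samples of known classes still count as in-distribution, and that AUROC is the metric of interest. The function `auroc` and all score values are hypothetical and chosen only for illustration.

```python
def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical confidence scores; higher means "more in-distribution".
clean_id = [0.95, 0.90, 0.85]              # ID classes, no shift
covariate_shifted_id = [0.70, 0.55, 0.40]  # same ID classes, shifted appearance
semantic_ood = [0.60, 0.30, 0.20]          # unseen classes (semantic shift)

# Standard setting: only clean ID samples are positives.
standard_auroc = auroc(clean_id, semantic_ood)

# Full-spectrum setting (assumed convention): covariate-shifted ID samples
# are also positives, so the detector must not reject them as OOD.
full_spectrum_auroc = auroc(clean_id + covariate_shifted_id, semantic_ood)

print(f"standard AUROC:      {standard_auroc:.3f}")
print(f"full-spectrum AUROC: {full_spectrum_auroc:.3f}")
```

With these made-up scores the standard AUROC is 1.0 while the full-spectrum AUROC drops to about 0.889, because some covariate-shifted ID samples receive lower confidence than a semantic OOD sample. This is the kind of gap that evaluating both shift types together is meant to expose.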
