CV · LG · May 9

Beyond Toy Benchmarks: A Systematic Evaluation of OOD Detection Methods For Plant Pathology Classification

arXiv:2605.08618 · 2.0


AI Analysis

Provides practical insights for deploying OOD detection in domain-specific applications, highlighting limitations of benchmark evaluations.

Evaluated six OOD detection methods on a real-world plant pathology dataset, finding energy-based fine-tuning outperforms others while preserving accuracy, and documented training instabilities in constrained optimization methods.

Out-of-distribution (OOD) detection is essential for reliable deployment of deep learning systems, yet the majority of existing methods are evaluated on small, visually homogeneous benchmarks. In this work, we study six OOD detection methods spanning post-hoc scoring, auxiliary objectives, energy-based models, and constrained optimization on the Plant Pathology 2021 dataset, a fine-grained task with natural distribution shifts. Energy-based fine-tuning performs best across OOD settings, improving detection over the softmax baseline while preserving in-distribution accuracy. Analysis shows these gains stem from both a restructuring of the embedding space and calibration of the scoring function. We further document practical training instabilities that arise when scaling constrained optimization methods to moderate-sized datasets, findings that are largely absent from existing literature. Our results demonstrate that principled OOD detection is achievable on real-world domain-specific data and that benchmark evaluations alone may not capture the challenges that emerge in practice.
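For readers unfamiliar with the two scoring functions the abstract contrasts, the sketch below compares a maximum-softmax-probability score with an energy-style score computed from the same logits. It is only an illustration of the general idea, not the paper's implementation: the abstract does not give its exact formulation, the function names and example logits are hypothetical, and the fine-tuning objective that the paper evaluates is not shown.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def msp_score(logits):
    # Softmax baseline: the largest softmax probability.
    # Higher values suggest an in-distribution input.
    return math.exp(max(logits) - logsumexp(logits))

def energy_score(logits, temperature=1.0):
    # Negative free energy: T * logsumexp(logits / T).
    # Higher values suggest an in-distribution input.
    scaled = [z / temperature for z in logits]
    return temperature * logsumexp(scaled)

# Hypothetical logits (assumed for illustration, not from the paper).
confident = [9.0, 0.5, 0.2]  # one class clearly dominates
flat = [0.4, 0.3, 0.2]       # no class stands out

for name, logits in [("confident", confident), ("flat", flat)]:
    print(name, round(msp_score(logits), 4), round(energy_score(logits), 4))
```

In both cases a threshold on the score would separate accepted inputs from those flagged as OOD. The energy score keeps the overall magnitude of the logits, which softmax normalization discards.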
