A Critical Evaluation of Open-World Machine Learning
This work addresses the reliability of open-world ML systems, which is crucial for real-world deployment. However, the contribution is incremental: it builds on existing methods by combining robust classifiers with out-of-distribution (OOD) detection.
The paper evaluates the resilience of open-world machine learning systems to diverse and adversarial conditions. It finds that false positive rates for out-of-distribution detection can exceed 70% depending on the choice of system components, and reach up to 100% under certain corruptions or adversarial perturbations.
Open-world machine learning (ML) combines closed-world models trained on in-distribution data with out-of-distribution (OOD) detectors, which aim to detect and reject OOD inputs. Previous works on open-world ML systems usually fail to test their reliability under diverse and possibly adversarial conditions. Therefore, in this paper, we seek to understand how resilient state-of-the-art open-world ML systems are to changes in system components. With our evaluation across 6 OOD detectors, we find that the choice of in-distribution data, model architecture, and OOD data has a strong impact on OOD detection performance, inducing false positive rates in excess of $70\%$. We further show that OOD inputs with 22 unintentional corruptions or adversarial perturbations render open-world ML systems unusable, with false positive rates of up to $100\%$. To increase the resilience of open-world ML, we combine robust classifiers with OOD detection techniques and uncover a new trade-off between OOD detection and robustness.
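To make the reported metric concrete, the following is a minimal sketch of an open-world pipeline and its false positive rate. It rests on several assumptions that are not taken from the paper: the OOD score is the maximum softmax probability (a common baseline, not necessarily one of the 6 detectors evaluated), the rejection threshold is set so that a chosen fraction of in-distribution inputs is accepted, and a false positive is an OOD input that the system accepts as in-distribution. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def threshold_at_tpr(in_scores, tpr=0.95):
    """Largest threshold that still accepts at least `tpr` of the
    in-distribution scores (assumed convention, not from the paper)."""
    if not in_scores:
        raise ValueError("need at least one in-distribution score")
    ordered = sorted(in_scores, reverse=True)
    k = max(1, math.ceil(tpr * len(ordered)))
    return ordered[k - 1]


def false_positive_rate(ood_scores, threshold):
    """Fraction of OOD inputs wrongly accepted as in-distribution."""
    if not ood_scores:
        raise ValueError("need at least one OOD score")
    accepted = sum(1 for s in ood_scores if s >= threshold)
    return accepted / len(ood_scores)


def open_world_predict(logits, threshold):
    """Closed-world classifier plus OOD detector: return the predicted
    class index, or None when the input is rejected as OOD."""
    probs = softmax(logits)
    score = max(probs)  # max softmax probability as the OOD score
    if score < threshold:
        return None
    return probs.index(score)
```

Under these assumptions, a false positive rate of $100\%$ means that every OOD input scores at or above the threshold, so the detector rejects nothing and the system behaves like a plain closed-world classifier.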