DICE: Leveraging Sparsification for Out-of-Distribution Detection
This paper addresses the challenge of OOD detection for real-world ML safety, but its contribution is incremental: it builds on existing methods by incorporating sparsification.
The paper tackles the problem of detecting out-of-distribution (OOD) inputs for the safe deployment of machine learning models. It proposes DICE, a sparsification-based framework that improves OOD detection by reducing output variance on OOD data and strengthening separability from in-distribution (ID) data, and it achieves competitive performance on benchmarks.
Detecting out-of-distribution (OOD) inputs is a central challenge for safely deploying machine learning models in the real world. Previous methods commonly rely on an OOD score derived from the overparameterized weight space, while largely overlooking the role of sparsification. In this paper, we show that reliance on unimportant weights and units can directly contribute to the brittleness of OOD detection. To mitigate the issue, we propose a sparsification-based OOD detection framework termed DICE. Our key idea is to rank weights based on a measure of contribution, and selectively use the most salient weights to derive the output for OOD detection. We provide both empirical and theoretical insights, characterizing and explaining the mechanism by which DICE improves OOD detection. By pruning away noisy signals, DICE provably reduces the output variance for OOD data, resulting in a sharper output distribution and stronger separability from in-distribution (ID) data. We demonstrate the effectiveness of sparsification-based OOD detection on several benchmarks and establish competitive performance.
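The ranking-and-masking idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the contribution of a weight is taken to be the weight times the mean ID activation of its input unit, sparsification is applied only to the final linear layer, the OOD score is a log-sum-exp over the sparsified logits, and the names `contribution_mask`, `sparsified_logits`, `ood_score`, and `keep_fraction` are hypothetical.

```python
import numpy as np


def contribution_mask(weights, id_features, keep_fraction=0.1):
    """Build a 0/1 mask that keeps the highest-contribution weights.

    weights:     (num_classes, num_features) final-layer weight matrix.
    id_features: (num_samples, num_features) activations on ID data.
    Assumption: contribution = weight * mean ID activation of its unit.
    """
    mean_activation = id_features.mean(axis=0)       # (num_features,)
    contribution = weights * mean_activation         # broadcast over classes
    num_keep = max(1, int(round(keep_fraction * contribution.size)))
    # The num_keep-th largest contribution serves as the cut-off
    # (ties at the cut-off are all kept).
    threshold = np.partition(contribution.ravel(), -num_keep)[-num_keep]
    return (contribution >= threshold).astype(weights.dtype)


def sparsified_logits(features, weights, bias, mask):
    """Compute logits using only the weights selected by the mask."""
    return features @ (weights * mask).T + bias


def ood_score(logits):
    """Numerically stable log-sum-exp per sample (assumed scoring function).

    Higher values are treated as more ID-like in this sketch.
    """
    peak = logits.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(logits - peak).sum(axis=1, keepdims=True))).ravel()


# Usage on synthetic data (random values, for illustration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))
b = np.zeros(10)
id_feats = np.abs(rng.normal(size=(500, 64)))        # ReLU-like activations
mask = contribution_mask(W, id_feats, keep_fraction=0.1)
scores = ood_score(sparsified_logits(id_feats[:5], W, b, mask))
print(mask.sum(), scores.shape)
```

In this sketch the mask is computed once from ID data and reused at test time, so the only added inference cost is an elementwise product with the weight matrix. The value of `keep_fraction` is a free parameter here and would need to be tuned on held-out data.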