Mysteries of the Deep: Role of Intermediate Representations in Out of Distribution Detection
This work addresses the reliability of machine learning models in real-world deployments by improving OOD detection. The approach is incremental in novelty but delivers strong gains on specific benchmarks.
The paper tackles out-of-distribution detection by challenging the reliance on final-layer representations in pre-trained models. It shows that intermediate layers encode rich signals for distributional shifts and introduces an entropy-based method to select complementary layers. Compared to state-of-the-art training-free methods, this increases accuracy by up to 10% on far-OOD benchmarks and by over 7% on near-OOD benchmarks.
Out-of-distribution (OOD) detection is essential for reliably deploying machine learning models in the wild. Yet, most methods treat large pre-trained models as monolithic encoders and rely solely on their final-layer representations for detection. We challenge this conventional wisdom. We reveal that the \textit{intermediate layers} of pre-trained models, shaped by residual connections that subtly transform input projections, \textit{can} encode \textit{surprisingly rich and diverse signals} for detecting distributional shifts. Importantly, to exploit the diversity of latent representations across layers, we introduce an entropy-based criterion that \textit{automatically} identifies the layers offering the most complementary information in a training-free setting -- \textit{without access to OOD data}. We show that selectively incorporating these intermediate representations can increase the accuracy of OOD detection by up to \textbf{$10\%$} in far-OOD and over \textbf{$7\%$} in near-OOD benchmarks compared to state-of-the-art training-free methods across various model architectures and training objectives. Our findings reveal a new avenue for OOD detection research and uncover the impact of various training objectives and model architectures on confidence-based OOD detection methods.
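To make the idea of entropy-based layer selection concrete, the following is a minimal sketch and not the paper's actual criterion, which the abstract does not specify. It assumes that each layer has already been mapped to per-class logits on in-distribution data only (for example, by comparing layer features to class prototypes). The function names `select_layers` and `fused_confidence`, the rule of keeping the `k` layers with the lowest mean predictive entropy, and the averaging of maximum softmax probabilities are all illustrative assumptions.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mean_entropy(probs):
    # Shannon entropy per sample (in nats), averaged over all samples.
    h = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    return float(h.mean())

def select_layers(layer_logits, k):
    """Rank layers by mean predictive entropy on in-distribution data.

    layer_logits: dict mapping a (hypothetical) layer name to an array of
    shape (n_samples, n_classes). No OOD data is used.
    ASSUMPTION: the k lowest-entropy layers are kept; the paper's actual
    criterion may differ.
    """
    ent = {name: mean_entropy(softmax(l)) for name, l in layer_logits.items()}
    return sorted(ent, key=ent.get)[:k], ent

def fused_confidence(layer_logits, selected):
    # Average the maximum softmax probability over the selected layers.
    # Higher values suggest in-distribution inputs (an assumed fusion rule).
    return np.mean(
        [softmax(layer_logits[n]).max(axis=-1) for n in selected], axis=0
    )

# Toy logits for three hypothetical layers, two samples, two classes.
toy = {
    "layer_a": np.array([[10.0, 0.0], [0.0, 10.0]]),  # confident
    "layer_b": np.zeros((2, 2)),                      # uniform
    "layer_c": np.array([[2.0, 0.0], [0.0, 2.0]]),    # moderately confident
}
chosen, entropies = select_layers(toy, k=2)
scores = fused_confidence(toy, chosen)
```

In this toy setup the uniform layer has the highest entropy and is dropped. In practice, a threshold on `scores` would separate in-distribution from OOD inputs, and the choice of `k` would need to be validated on in-distribution data alone.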