LG · Jun 12, 2025

Improving Group Robustness on Spurious Correlation via Evidential Alignment

arXiv:2506.11347v3 · 3 citations · h-index: 10 · KDD
Originality: Incremental advance
AI Analysis

This paper addresses model trustworthiness and generalization in AI systems, particularly in applications where spurious correlations are common. The advance is incremental: it builds on existing debiasing methods and removes their need for costly annotations.

The paper tackles the reliance of deep neural networks on spurious correlations, which harms out-of-distribution robustness. It proposes Evidential Alignment, a framework that uses uncertainty quantification to debias models without group annotations, and reports significant improvements in group robustness across various architectures and data modalities.
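Group robustness is commonly summarized by worst-group accuracy, the lowest accuracy over all (label, spurious attribute) groups. The sketch below illustrates that metric only; it is not taken from the paper. The function name `worst_group_accuracy` and the toy camel/cow data are hypothetical, and the group labels are used here solely to score predictions.

```python
from collections import defaultdict

def worst_group_accuracy(labels, spurious, preds):
    """Return (worst accuracy, per-group accuracies) over (label, attribute) groups."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, a, p in zip(labels, spurious, preds):
        total[(y, a)] += 1
        correct[(y, a)] += int(p == y)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group

# Hypothetical toy data: label 1 = camel, 0 = cow; attribute 1 = desert, 0 = pasture.
labels   = [1, 1, 1, 1, 0, 0, 0, 0]
spurious = [1, 1, 1, 0, 0, 0, 0, 1]
# A model that predicts purely from the background, so preds equal the attribute.
preds    = list(spurious)

average = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
worst, per_group = worst_group_accuracy(labels, spurious, preds)
print(f"average accuracy: {average:.2f}, worst-group accuracy: {worst:.2f}")
```

On this toy data the background-only model looks acceptable on average while failing entirely on the two minority groups (camels on pasture, cows in desert), which is the gap that group-robustness methods aim to close.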

Deep neural networks often learn and rely on spurious correlations, i.e., superficial associations between non-causal features and the targets. For instance, an image classifier may identify camels based on desert backgrounds. While such reliance can yield high overall accuracy during training, it degrades generalization in more diverse scenarios where these correlations do not hold. This problem poses significant challenges for out-of-distribution robustness and trustworthiness. Existing methods typically mitigate this issue by using external group annotations or auxiliary deterministic models to learn unbiased representations. However, such information is costly to obtain, and deterministic models may fail to capture the full spectrum of biases learned by the models. To address these limitations, we propose Evidential Alignment, a novel framework that leverages uncertainty quantification to understand the behavior of the biased models without requiring group annotations. By quantifying the evidence of model predictions with second-order risk minimization and calibrating the biased models with the proposed evidential calibration technique, Evidential Alignment identifies and suppresses spurious correlations while preserving core features. We theoretically show that our method can learn the patterns of biased models and debias the model without requiring any spurious correlation annotations. Empirical results demonstrate that our method significantly improves group robustness across diverse architectures and data modalities, providing a scalable and principled solution to spurious correlations.
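To make "quantifying the evidence of model predictions" concrete, the sketch below uses the standard evidential deep learning formulation, in which non-negative evidence parameterizes a Dirichlet distribution over class probabilities. This is an assumption for illustration, not the paper's confirmed implementation: the softplus evidence function, the name `evidential_summary`, and the example logits are all hypothetical, and the paper's second-order risk minimization and evidential calibration steps are not reproduced here.

```python
import math

def softplus(x):
    # Numerically stable softplus; maps any real logit to non-negative evidence.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def evidential_summary(logits):
    """Return (expected class probabilities, uncertainty mass) for one example."""
    evidence = [softplus(z) for z in logits]
    alpha = [e + 1.0 for e in evidence]      # Dirichlet concentration parameters
    strength = sum(alpha)                    # total Dirichlet strength
    probs = [a / strength for a in alpha]    # mean of the Dirichlet
    uncertainty = len(logits) / strength     # number of classes / strength, in (0, 1]
    return probs, uncertainty

# Hypothetical logits: strong evidence for class 0 versus almost no evidence at all.
confident_probs, confident_u = evidential_summary([9.0, -4.0, -4.0])
unsure_probs, unsure_u = evidential_summary([-4.0, -4.0, -4.0])
print(f"confident: u = {confident_u:.3f}, unsure: u = {unsure_u:.3f}")
```

Unlike a softmax, this representation separates "which class" from "how much evidence": the second example yields near-uniform probabilities together with an uncertainty mass close to 1, a signal that a debiasing method could use to flag examples on which a biased model has little support.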

Code Implementations: 1 repo
