LG · Jul 22, 2025

Should Bias Always be Eliminated? A Principled Framework to Use Data Bias for OOD Generation

Yan Li, Guangyi Chen, Yunlong Deng, Zijian Li, Zeyu Tang, Anpeng Wu, Kun Zhang

Stanford

arXiv:2507.17001v1 · 4.1 · h-index: 13

Originality: Incremental advance

AI Analysis

This paper addresses domain generalization for machine learning practitioners by offering a new perspective on when and how data bias can be used, though it builds incrementally on invariant representation learning.

The paper tackles the problem of adapting models to out-of-distribution domains with a framework that strategically leverages data bias instead of eliminating it. Reported results show it outperforming existing approaches on synthetic datasets and standard domain generalization benchmarks.

Most existing methods for adapting models to out-of-distribution (OOD) domains rely on invariant representation learning to eliminate the influence of biased features. However, should bias always be eliminated -- and if not, when should it be retained, and how can it be leveraged? To address these questions, we first present a theoretical analysis that explores the conditions under which biased features can be identified and effectively utilized. Building on this theoretical foundation, we introduce a novel framework that strategically leverages bias to complement invariant representations during inference. The framework comprises two key components that leverage bias in both direct and indirect ways: (1) using invariance as guidance to extract predictive ingredients from bias, and (2) exploiting identified bias to estimate the environmental condition and then use it to explore appropriate bias-aware predictors to alleviate environment gaps. We validate our approach through experiments on both synthetic datasets and standard domain generalization benchmarks. Results consistently demonstrate that our method outperforms existing approaches, underscoring its robustness and adaptability.
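The two components described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it is a minimal illustration under strong assumptions (binary label, one binary invariant feature and one binary biased feature, conditional independence given the label, balanced classes, and a known accuracy for the invariant feature). All function names and parameters here are hypothetical. The invariant prediction serves as a noisy pseudo-label to estimate how the biased feature relates to the label in a new environment, and that estimate then selects a bias-aware predictor.

```python
import math
import random


def sample_environment(n, bias_agreement, inv_agreement=0.75, seed=0):
    """Draw (x_inv, x_bias, y) triples for one environment.

    x_inv matches y with the same probability in every environment;
    x_bias matches y with an environment-specific probability.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x_inv = y if rng.random() < inv_agreement else 1 - y
        x_bias = y if rng.random() < bias_agreement else 1 - y
        data.append((x_inv, x_bias, y))
    return data


def estimate_bias_agreement(unlabeled, inv_agreement):
    """Estimate p = P(x_bias == y) in a new environment without labels.

    x_inv acts as a noisy pseudo-label. Assuming x_inv and x_bias are
    independent given y, with a = inv_agreement:
        P(x_bias == x_inv) = a*p + (1 - a)*(1 - p)
    This is solved for p and clipped away from 0 and 1.
    """
    q = sum(x_inv == x_bias for x_inv, x_bias in unlabeled) / len(unlabeled)
    p = (q - (1 - inv_agreement)) / (2 * inv_agreement - 1)
    return min(max(p, 0.01), 0.99)


def log_odds(x, agreement):
    """Log-odds of y = 1 from a binary feature matching y with `agreement`."""
    w = math.log(agreement / (1 - agreement))
    return w if x == 1 else -w


def predict_bias_aware(x_inv, x_bias, inv_agreement, bias_agreement):
    """Naive-Bayes combination of invariant and biased evidence."""
    score = log_odds(x_inv, inv_agreement) + log_odds(x_bias, bias_agreement)
    return 1 if score > 0 else 0


def evaluate(bias_agreement, n=5000, inv_agreement=0.75, seed=0):
    """Return (invariant-only accuracy, bias-aware accuracy, estimated p)."""
    data = sample_environment(n, bias_agreement, inv_agreement, seed)
    unlabeled = [(x_inv, x_bias) for x_inv, x_bias, _ in data]
    p_hat = estimate_bias_agreement(unlabeled, inv_agreement)
    inv_acc = sum(x_inv == y for x_inv, _, y in data) / n
    combined_acc = sum(
        predict_bias_aware(x_inv, x_bias, inv_agreement, p_hat) == y
        for x_inv, x_bias, y in data
    ) / n
    return inv_acc, combined_acc, p_hat


if __name__ == "__main__":
    # One environment where the bias agrees with the label, one where it flips.
    for p in (0.9, 0.1):
        print(p, evaluate(p))
```

In this toy setting the biased feature is more informative than the invariant one once its direction in the current environment is known, so the combined predictor should beat the invariant-only predictor whether the bias agrees with the label or is reversed. Real data rarely satisfies the independence assumption, which is part of why the paper's identifiability analysis matters.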
