LG · May 11

DeconDTN-Toolkit: A Library for Evaluation and Enhancement of Robustness to Provenance Shift

Yongsen Tan, Zhecheng Sheng, Xiruo Ding, Serguei V. S. Pakhomov, Trevor Cohen

arXiv:2605.11237

AI Analysis

For researchers studying distribution shifts, this work addresses the under-explored problem of provenance shift by providing theoretical grounding and practical tools for evaluation and mitigation.

The paper formalizes the connection between provenance shift, counterfactual invariance, and invariant learning to derive a robustness objective, and introduces DeconDTN-Toolkit for evaluating and mitigating provenance shifts. It reveals ERM's vulnerability under such shifts and provides a robust OOD performance indicator.

Despite the burgeoning body of work on distribution shifts, provenance shift (where the relationship between data source and label changes at deployment) remains poorly understood and under-addressed. In this paper, we establish a formal connection between provenance shift, counterfactual invariance, and invariant learning to derive a learning objective for robustness. We then introduce DeconDTN-Toolkit, a specialized evaluation and remediation suite designed to simulate provenance shifts of varying degrees while maintaining the training protocol and infrastructure of existing benchmarks. We reveal the vulnerability of Empirical Risk Minimization under provenance shift, introduce a robust out-of-distribution performance indicator, and conduct a comprehensive evaluation of existing algorithms. Our work provides the theoretical grounding and practical tools needed to characterize confounding by provenance, along with implementations of methods to mitigate it.
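
To make the idea of simulating provenance shift concrete, here is a minimal sketch of one way to control the source-label relationship by subsampling. It is an illustration only and not the DeconDTN-Toolkit API: the function name `subsample_with_provenance_shift`, its parameters, and the assumption of binary labels (0/1) are all hypothetical.

```python
import random
from collections import defaultdict


def subsample_with_provenance_shift(labels, sources, pos_rate_by_source,
                                    n_per_source, seed=0):
    """Pick indices so each source has a chosen fraction of positive labels.

    Hypothetical helper (not part of any library). Assumes binary labels
    (0 or 1). Using different `pos_rate_by_source` values for the training
    and test splits changes P(label | source) between them.
    """
    rng = random.Random(seed)

    # Group example indices by (source, label).
    pools = defaultdict(list)
    for idx, (y, s) in enumerate(zip(labels, sources)):
        pools[(s, y)].append(idx)

    chosen = []
    for source, pos_rate in pos_rate_by_source.items():
        n_pos = round(n_per_source * pos_rate)
        n_neg = n_per_source - n_pos
        if n_pos > len(pools[(source, 1)]) or n_neg > len(pools[(source, 0)]):
            raise ValueError(f"not enough examples in source {source!r}")
        # Sample without replacement from each (source, label) pool.
        chosen += rng.sample(pools[(source, 1)], n_pos)
        chosen += rng.sample(pools[(source, 0)], n_neg)

    rng.shuffle(chosen)
    return chosen


# Toy data: two sources, each with balanced labels.
labels = [i % 2 for i in range(400)]
sources = ["A" if i < 200 else "B" for i in range(400)]

# Training split: source A is mostly positive, source B mostly negative.
train_idx = subsample_with_provenance_shift(
    labels, sources, {"A": 0.8, "B": 0.2}, n_per_source=100, seed=1)
# Test split: the source-label relationship is reversed.
test_idx = subsample_with_provenance_shift(
    labels, sources, {"A": 0.2, "B": 0.8}, n_per_source=100, seed=2)
```

Moving the test-split rates closer to or further from the training-split rates gives shifts of different degrees. A model that relies on source-specific cues rather than label-relevant content would be expected to degrade as the gap widens.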
