LG · Feb 27, 2025

Out-of-distribution Generalization for Total Variation based Invariant Risk Minimization

Yuanchao Wang, Zhao-Rong Lai, Tianqi Zhong

arXiv:2502.19665v211.44 citations · h-index: 2 · Has Code · ICLR

Originality: Incremental advance

AI Analysis

This work addresses out-of-distribution generalization for machine learning models. It is an incremental advance, since it builds on the existing IRM-TV method.

The paper aims to improve out-of-distribution generalization in invariant risk minimization by extending IRM-TV to a Lagrangian multiplier model called OOD-TV-IRM, which the authors report outperforms IRM-TV in most cases.

Invariant risk minimization is an important general machine learning framework that has recently been interpreted as a total variation model (IRM-TV). However, how to improve out-of-distribution (OOD) generalization in the IRM-TV setting remains unsolved. In this paper, we extend IRM-TV to a Lagrangian multiplier model named OOD-TV-IRM. We find that the autonomous TV penalty hyperparameter is exactly the Lagrangian multiplier. Thus OOD-TV-IRM is essentially a primal-dual optimization model, where the primal optimization minimizes the entire invariant risk and the dual optimization strengthens the TV penalty. The objective is to reach a semi-Nash equilibrium where the balance between the training loss and OOD generalization is maintained. We also develop a convergent primal-dual algorithm that facilitates an adversarial learning scheme. Experimental results show that OOD-TV-IRM outperforms IRM-TV in most situations.
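The primal-dual idea in the abstract can be illustrated with a minimal sketch. This is not the authors' OOD-TV-IRM algorithm: it is plain gradient descent-ascent on a toy Lagrangian, where the hypothetical `risk` term (w - 2)^2 stands in for the invariant risk, the hypothetical `penalty` term (w - 1) stands in for the TV penalty, and `lam` plays the role of the Lagrangian multiplier. The function name, step size, and toy objective are all assumptions chosen for illustration.

```python
def primal_dual(steps=2000, lr=0.1):
    """Gradient descent-ascent on a toy Lagrangian (illustrative only).

    L(w, lam) = (w - 2)**2 + lam * (w - 1), with lam >= 0.
    The primal variable w descends on L; the multiplier lam ascends on L,
    which strengthens the penalty whenever w - 1 > 0.
    """
    w, lam = 0.0, 0.0
    for _ in range(steps):
        # Gradients of the Lagrangian at the current point.
        grad_w = 2.0 * (w - 2.0) + lam   # dL/dw
        grad_lam = w - 1.0               # dL/dlam (the penalty value)
        # Primal step: minimize the full objective over w.
        w = w - lr * grad_w
        # Dual step: maximize over lam, projected onto lam >= 0.
        lam = max(0.0, lam + lr * grad_lam)
    return w, lam


if __name__ == "__main__":
    w, lam = primal_dual()
    print(f"w = {w:.4f}, lam = {lam:.4f}")
```

For this toy problem the iterates approach the saddle point w = 1, lam = 2, which follows from the stationarity condition 2(w - 2) + lam = 0 at w = 1. In the paper's setting the multiplier is learned rather than hand-tuned, and the equilibrium sought is described as a semi-Nash equilibrium; the sketch only shows the alternating roles of the two updates.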
