ML · AI · LG · Jan 4, 2021

Does Invariant Risk Minimization Capture Invariance?

arXiv:2101.01134v2 · 149 citations
AI Analysis

For machine learning researchers and practitioners, this paper identifies fundamental limitations of IRM, showing that it can fail to achieve its stated goal of improved out-of-distribution generalization.

This paper demonstrates that Invariant Risk Minimization (IRM), particularly its linear variant, can fail to capture natural invariances even in simple scenarios, leading to worse generalization than unconstrained Empirical Risk Minimization (ERM). The failure is attributed to a gap between the linear and full non-linear IRM formulations, and it is exacerbated by the method's fragility to sampling.

We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. (2019) can fail to capture "natural" invariances, at least when used in its practical "linear" form, and even on very simple problems which directly follow the motivating examples for IRM. This can lead to worse generalization on new environments, even when compared to unconstrained ERM. The issue stems from a significant gap between the linear variant (as in their concrete method IRMv1) and the full non-linear IRM formulation. Additionally, even when capturing the "right" invariances, we show that it is possible for IRM to learn a sub-optimal predictor, due to the loss function not being invariant across environments. The issues arise even when measuring invariance on the population distributions, but are exacerbated by the fact that IRM is extremely fragile to sampling.
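To make the "linear" form concrete, here is a minimal sketch of the IRMv1 objective of Arjovsky et al. (2019), specialised to squared loss and a scalar predictor. For each environment, the penalty is the squared gradient of that environment's risk with respect to a dummy scalar multiplier w, evaluated at w = 1. The function names (`risk`, `irmv1_penalty`, `irmv1_objective`), the toy data, and the weight `lam` are illustrative assumptions, not taken from the paper, and the sketch does not reproduce any of its experiments.

```python
def risk(preds, ys):
    """Mean squared error of predictions on one environment."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def irmv1_penalty(preds, ys):
    """Squared gradient of the risk of w * pred with respect to w, at w = 1.

    For squared loss, d/dw mean((w*p - y)^2) at w = 1 is mean(2*(p - y)*p).
    """
    grad = sum(2.0 * (p - y) * p for p, y in zip(preds, ys)) / len(ys)
    return grad ** 2


def irmv1_objective(envs, predictor, lam):
    """Sum over environments of risk plus lam times the IRMv1 penalty.

    envs: list of (xs, ys) pairs, one per training environment (hypothetical data).
    predictor: any function mapping a scalar x to a scalar prediction.
    """
    total = 0.0
    for xs, ys in envs:
        preds = [predictor(x) for x in xs]
        total += risk(preds, ys) + lam * irmv1_penalty(preds, ys)
    return total


# Toy environment in which y = 2x exactly (an assumption for illustration).
toy_envs = [([1.0, 2.0], [2.0, 4.0])]

perfect = irmv1_objective(toy_envs, lambda x: 2.0 * x, lam=1.0)
misfit = irmv1_objective(toy_envs, lambda x: x, lam=1.0)
```

In this toy case the exact predictor incurs zero risk and zero penalty, while the under-scaled predictor is penalised on both terms. Note that the penalty only tests optimality of a single scalar multiplier per environment, which is the linear relaxation the abstract contrasts with the full non-linear IRM formulation.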
