LG · ML · Jun 18, 2021

Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments

arXiv:2106.09913v2 · 36 citations
Originality: Highly original
AI Analysis

This work addresses domain generalization by providing theoretical guarantees for distribution-matching algorithms. It is incremental in that it builds on an existing data model, but it offers new provable bounds.

The paper tackles domain generalization with a limited number of training environments. It shows that both ERM and IRM fail when the number of environments is sublinear in the spurious feature dimension, and it presents an iterative feature matching algorithm that provably generalizes with only logarithmically many environments.

Domain generalization aims to perform well on unseen test environments using data from a limited number of training environments. Despite a proliferation of proposed algorithms for this task, assessing their performance both theoretically and empirically is still very challenging. Distributional matching algorithms such as (Conditional) Domain Adversarial Networks [Ganin et al., 2016, Long et al., 2018] are popular and enjoy empirical success, but they lack formal guarantees. Other approaches such as Invariant Risk Minimization (IRM) require a prohibitively large number of training environments (linear in the dimension of the spurious feature space $d_s$), even on simple data models like the one proposed by [Rosenfeld et al., 2021]. Under a variant of this model, we show that both ERM and IRM cannot generalize with $o(d_s)$ environments. We then present an iterative feature matching algorithm that is guaranteed with high probability to yield a predictor that generalizes after seeing only $O(\log d_s)$ environments. Our results provide the first theoretical justification for a family of distribution-matching algorithms widely used in practice under a concrete nontrivial data model.
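
To make the idea of matching feature distributions across environments concrete, here is a minimal sketch. It is not the paper's iterative algorithm, and the exact data model variant is not given in the abstract. The sketch assumes a simplified Gaussian setup in the spirit of [Rosenfeld et al., 2021]: the label $y$ is $\pm 1$, each coordinate is drawn as $N(y\mu, \sigma^2)$, the invariant means are shared by all environments, the spurious means change per environment, and the observed input is the raw concatenation of both (an identity feature map). The function names, the hand-picked means, and the tolerance `tol` are all hypothetical choices made for illustration.

```python
import random
import statistics

def sample_environment(rng, n, mu_c, mu_e, sigma=0.5):
    """Draw n labelled points x = (z_c, z_e), each coordinate ~ N(y * mu, sigma^2)."""
    data = []
    for _ in range(n):
        y = 1 if rng.random() < 0.5 else -1
        x = [rng.gauss(y * m, sigma) for m in mu_c + mu_e]
        data.append((x, y))
    return data

def conditional_means(data, dim):
    """Estimate each coordinate's signed class-conditional mean as the average of y * x_j."""
    return [statistics.fmean(y * x[j] for x, y in data) for j in range(dim)]

def matched_coordinates(envs, dim, tol):
    """Keep coordinates whose estimated conditional mean agrees across all environments."""
    means = [conditional_means(data, dim) for data in envs]
    kept = []
    for j in range(dim):
        values = [m[j] for m in means]
        if max(values) - min(values) <= tol:
            kept.append(j)
    return kept

rng = random.Random(0)
mu_c = [1.0, -1.0]  # assumed invariant means, identical in every environment
envs = []
for e in range(4):
    # Assumed spurious means, hand-picked so that they differ between environments.
    mu_e = [0.5 * (e + 1), -0.4 * (e + 1), 1.0 - 0.7 * e]
    envs.append(sample_environment(rng, 2000, mu_c, mu_e))

kept = matched_coordinates(envs, dim=5, tol=0.2)
print(kept)
```

Under these assumptions, only the first two (invariant) coordinates pass the matching test, so a predictor restricted to them would not rely on the environment-specific coordinates. This one-shot, coordinate-wise check is far simpler than the setting the paper analyzes, where the features are not given coordinate by coordinate and the matching is applied iteratively.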
