LG · CV · Mar 21, 2022

Domain Generalization by Mutual-Information Regularization with Pre-trained Models

arXiv:2203.10789v2 · 187 citations · h-index: 27 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses domain generalization, the problem of training machine learning models that handle unseen target domains, and reports incremental improvements over prior methods.

The paper tackles domain generalization by reformulating the objective using mutual information with an oracle model, approximated by a pre-trained model, resulting in significant out-of-distribution performance improvements that scale with model size.

Domain generalization (DG) aims to learn a generalized model to an unseen target domain using only limited source domains. Previous attempts to DG fail to learn domain-invariant representations only from the source domains due to the significant domain shifts between training and test domains. Instead, we re-formulate the DG objective using mutual information with the oracle model, a model generalized to any possible domain. We derive a tractable variational lower bound via approximating the oracle model by a pre-trained model, called Mutual Information Regularization with Oracle (MIRO). Our extensive experiments show that MIRO significantly improves the out-of-distribution performance. Furthermore, our scaling experiments show that the larger the scale of the pre-trained model, the greater the performance improvement of MIRO. Source code is available at https://github.com/kakaobrain/miro.
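The abstract names a variational lower bound on mutual information but does not spell out its form. As a minimal sketch, assume the variational distribution is a Gaussian with diagonal covariance; this is an illustrative assumption, not a detail taken from the text above. Under that assumption, maximizing the bound amounts to minimizing the negative Gaussian log-likelihood of the pre-trained model's features given the learned features. The names `mi_regularizer`, `total_loss`, `mean_fn`, `log_var`, and `lam` are hypothetical and do not come from the paper or its repository.

```python
import math
from typing import Callable, Optional, Sequence


def mi_regularizer(
    z: Sequence[float],
    z0: Sequence[float],
    log_var: Sequence[float],
    mean_fn: Optional[Callable[[Sequence[float]], Sequence[float]]] = None,
) -> float:
    """Negative log-likelihood of z0 under N(mean_fn(z), diag(exp(log_var))).

    z       : features from the model being trained
    z0      : features from the frozen pre-trained model (oracle stand-in)
    log_var : per-dimension log-variance of the assumed Gaussian
    mean_fn : hypothetical mean encoder; identity when omitted
    Additive constants that do not depend on the inputs are dropped.
    """
    mu = list(z) if mean_fn is None else list(mean_fn(z))
    if not (len(mu) == len(z0) == len(log_var)):
        raise ValueError("feature vectors must share one dimension")
    total = 0.0
    for m, target, lv in zip(mu, z0, log_var):
        # log-variance term plus variance-scaled squared error
        total += lv + (target - m) ** 2 / math.exp(lv)
    return 0.5 * total


def total_loss(task_loss: float, reg: float, lam: float = 0.1) -> float:
    """Task loss plus the weighted regularizer; lam is an assumed weight."""
    return task_loss + lam * reg
```

With unit variance (`log_var` all zero) the regularizer reduces to half the squared distance between the two feature vectors, so it vanishes when the learned features match the pre-trained ones.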

Code Implementations: 1 repo
