LG | ML | Dec 19, 2021

Rethinking Importance Weighting for Transfer Learning

arXiv:2112.10157v1 | 16 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental review article that synthesizes existing research for practitioners and researchers in machine learning dealing with non-i.i.d. data.

The paper reviews transfer learning methods that address distribution shift, focusing on importance-weighting and introducing recent advances like joint and dynamic estimation, as well as causal mechanism transfer.

A key assumption in supervised learning is that training and test data follow the same probability distribution. However, this fundamental assumption is not always satisfied in practice, e.g., due to changing environments, sample selection bias, privacy concerns, or high labeling costs. Transfer learning (TL) relaxes this assumption and allows us to learn under distribution shift. Classical TL methods typically rely on importance-weighting: a predictor is trained on the training losses weighted according to the importance (i.e., the test-over-training density ratio). However, as real-world machine learning tasks become increasingly complex, high-dimensional, and dynamic, novel approaches have recently been explored to cope with such challenges. In this article, after introducing the foundations of TL based on importance-weighting, we review recent advances based on joint and dynamic importance-predictor estimation. Furthermore, we introduce a method of causal mechanism transfer that incorporates causal structure in TL. Finally, we discuss future perspectives of TL research.
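To make the importance-weighting idea in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the paper's method, and everything in it is an illustrative assumption: the one-dimensional Gaussian input densities, the quadratic target, the linear model, and the helper names `gaussian_pdf`, `fit_weighted_linear`, and `mse`. In particular, it assumes both densities are known, so the importance (test density divided by training density) can be computed exactly; in practice the ratio has to be estimated from data.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def fit_weighted_linear(x, y, w):
    """Minimise sum_i w_i * (y_i - a * x_i - b)^2 and return (a, b)."""
    X = np.column_stack([x, np.ones_like(x)])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns weighted LS into ordinary LS
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

def mse(coef, x, y):
    """Mean squared error of the linear predictor a * x + b."""
    return float(np.mean((coef[0] * x + coef[1] - y) ** 2))

rng = np.random.default_rng(0)
n = 5000

# Assumed covariate shift: inputs move from N(0, 1) to N(1, 0.5^2),
# while the input-output relation y = x^2 + noise stays the same.
x_tr = rng.normal(0.0, 1.0, n)
y_tr = x_tr ** 2 + rng.normal(0.0, 0.1, n)
x_te = rng.normal(1.0, 0.5, n)
y_te = x_te ** 2 + rng.normal(0.0, 0.1, n)

# Importance = test-over-training density ratio, evaluated at training inputs.
w = gaussian_pdf(x_tr, 1.0, 0.5) / gaussian_pdf(x_tr, 0.0, 1.0)

coef_plain = fit_weighted_linear(x_tr, y_tr, np.ones(n))  # ordinary least squares
coef_iw = fit_weighted_linear(x_tr, y_tr, w)              # importance-weighted

mse_plain = mse(coef_plain, x_te, y_te)
mse_iw = mse(coef_iw, x_te, y_te)
print(f"test MSE, unweighted: {mse_plain:.3f}")
print(f"test MSE, importance-weighted: {mse_iw:.3f}")
```

Because the linear model is misspecified for a quadratic target, the unweighted fit settles on the region where training inputs are dense, whereas the weighted fit concentrates on the region the test distribution covers. With a correctly specified model the two fits would agree in the large-sample limit, and weighting would mainly add variance.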

