CL, Apr 26, 2021

Evaluating the Values of Sources in Transfer Learning

arXiv:2104.12567v1728 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently evaluating multiple sources in order to optimize transfer learning. It is an incremental advance: it builds on existing Shapley value methods to tackle a specific bottleneck in NLP.

The paper tackles the problem of identifying which data sources are most useful for transfer learning in NLP. It develops SEAL-Shap, a framework that uses Shapley values to quantify the value of each source. Experiments show that the framework selects useful sources effectively and that the resulting source values align with intuitive source-target similarity.

Transfer learning that adapts a model trained on data-rich sources to low-resource targets has been widely applied in natural language processing (NLP). However, when training a transfer model over multiple sources, not every source is equally useful for the target. To better transfer a model, it is essential to understand the values of the sources. In this paper, we develop SEAL-Shap, an efficient source valuation framework for quantifying the usefulness of the sources (e.g., domains/languages) in transfer learning based on the Shapley value method. Experiments and comprehensive analyses on both cross-domain and cross-lingual transfers demonstrate that our framework is not only effective in choosing useful transfer sources but also the source values match the intuitive source-target similarity.
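
To make the idea of a Shapley-based source value concrete, here is a minimal sketch of the generic permutation-sampling estimator. It is not the SEAL-Shap implementation. The function names, the source names, and the toy gains are all hypothetical, and the result cache is only a simple shortcut chosen for this illustration. In a real setting, `utility` would train a transfer model on the given subset of sources and return its score on the target.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_source_values(
    sources: Sequence[str],
    utility: Callable[[FrozenSet[str]], float],
    num_permutations: int = 200,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate each source's Shapley value by sampling random orderings."""
    rng = random.Random(seed)
    totals = {s: 0.0 for s in sources}
    cache: Dict[FrozenSet[str], float] = {}

    def cached_utility(subset: FrozenSet[str]) -> float:
        # Each distinct subset is scored once (illustrative shortcut only).
        if subset not in cache:
            cache[subset] = utility(subset)
        return cache[subset]

    for _ in range(num_permutations):
        order = list(sources)
        rng.shuffle(order)
        included: FrozenSet[str] = frozenset()
        previous = cached_utility(included)
        for s in order:
            # Marginal contribution of s given the sources added before it.
            included = included | {s}
            current = cached_utility(included)
            totals[s] += current - previous
            previous = current

    return {s: totals[s] / num_permutations for s in sources}


# Hypothetical toy utility: each source adds a fixed gain to a base score.
toy_gain = {"reviews": 0.30, "news": 0.10, "forums": -0.05}


def toy_utility(subset: FrozenSet[str]) -> float:
    return 0.50 + sum(toy_gain[s] for s in subset)


values = shapley_source_values(list(toy_gain), toy_utility)
ranked = sorted(values, key=values.get, reverse=True)
```

Because the toy utility is additive, each estimated value equals that source's own gain, and a negative value flags a source that hurts the target. With a real utility the sources interact, so the estimate depends on how many orderings are sampled.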

Code Implementations: 1 repo
