ML · LG · Apr 29, 2023

Limits of Model Selection under Transfer Learning

arXiv:2305.00152v4 · 7 citations · h-index: 19
Originality: Incremental advance
AI Analysis

The paper addresses a foundational gap in transfer learning theory that matters to practitioners, but the advance is incremental because it builds on existing theoretical frameworks.

The paper tackles model selection in transfer learning, where the transfer distance between source and target distributions affects performance. It finds that adaptive rates, achieved without distributional information, can be arbitrarily slower than oracle rates, which assume knowledge of those distances.

Theoretical studies on transfer learning or domain adaptation have so far focused on situations with a known hypothesis class or model; however in practice, some amount of model selection is usually involved, often appearing under the umbrella term of hyperparameter-tuning: for example, one may think of the problem of tuning for the right neural network architecture towards a target task, while leveraging data from a related source task. Now, in addition to the usual tradeoffs on approximation vs estimation errors involved in model selection, this problem brings in a new complexity term, namely, the transfer distance between source and target distributions, which is known to vary with the choice of hypothesis class. We present a first study of this problem, focusing on classification; in particular, the analysis reveals some remarkable phenomena: adaptive rates, i.e., those achievable with no distributional information, can be arbitrarily slower than oracle rates, i.e., when given knowledge on distances.
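To make the setting concrete, here is a minimal sketch of model selection with source and target data. It is not the paper's procedure: the hypothesis classes (histogram classifiers with k bins), the synthetic threshold distributions, and the hold-out selection rule are all assumptions chosen for illustration. It only shows the kind of adaptive, data-driven selection whose rates the abstract compares against oracle rates.

```python
import random

def fit_histogram(points, k):
    """Fit a k-bin histogram classifier on [0, 1) by majority vote per bin."""
    votes = [0] * k
    for x, y in points:
        votes[min(int(x * k), k - 1)] += 1 if y == 1 else -1
    return [1 if v > 0 else 0 for v in votes]

def predict(model, x):
    """Return the label of the bin that x falls into."""
    k = len(model)
    return model[min(int(x * k), k - 1)]

def error(model, points):
    """Fraction of points the model misclassifies."""
    return sum(predict(model, x) != y for x, y in points) / len(points)

def sample(n, threshold, noise, rng):
    """Draw n points; label is 1 when x >= threshold, flipped with prob. noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x >= threshold else 0
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def select_model(source, target_train, target_val, class_sizes):
    """Train each class on pooled data; pick the one with lowest target hold-out error."""
    best = None
    for k in class_sizes:
        model = fit_histogram(source + target_train, k)
        val_err = error(model, target_val)
        if best is None or val_err < best[1]:
            best = (k, val_err, model)
    return best

# Hypothetical setup: plentiful source data, scarce target data, shifted threshold.
rng = random.Random(0)
source = sample(2000, threshold=0.5, noise=0.1, rng=rng)
target = sample(60, threshold=0.6, noise=0.1, rng=rng)
target_train, target_val = target[:30], target[30:]

k, val_err, model = select_model(source, target_train, target_val, [1, 2, 4, 8, 16])
print(f"selected k={k}, target hold-out error={val_err:.3f}")
```

The selection rule here uses only observed data and no knowledge of how far the source is from the target under each class, which is the "adaptive" regime in the abstract's terminology. An oracle, by contrast, would be told those distances for each candidate class.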

