ML LG
Apr 29, 2023

Limits of Model Selection under Transfer Learning

Steve Hanneke, Samory Kpotufe, Yasaman Mahdaviyeh

arXiv:2305.00152v4
11.88 citations
h-index: 19

Originality: Incremental advance

AI Analysis

The paper addresses a foundational gap in transfer learning theory that is relevant to practitioners, but the advance is incremental because it builds on existing theoretical frameworks.

The paper tackles the problem of model selection in transfer learning, where the transfer distance between source and target distributions affects performance, and finds that adaptive rates without distributional information can be arbitrarily slower than oracle rates with such knowledge.

Theoretical studies on transfer learning or domain adaptation have so far focused on situations with a known hypothesis class or model; however in practice, some amount of model selection is usually involved, often appearing under the umbrella term of hyperparameter-tuning: for example, one may think of the problem of tuning for the right neural network architecture towards a target task, while leveraging data from a related source task. Now, in addition to the usual tradeoffs on approximation vs estimation errors involved in model selection, this problem brings in a new complexity term, namely, the transfer distance between source and target distributions, which is known to vary with the choice of hypothesis class. We present a first study of this problem, focusing on classification; in particular, the analysis reveals some remarkable phenomena: adaptive rates, i.e., those achievable with no distributional information, can be arbitrarily slower than oracle rates, i.e., when given knowledge on distances.
