On the Value of Target Data in Transfer Learning
This work addresses the practical issue of data acquisition costs in transfer learning for machine learning practitioners, providing foundational theoretical insights.
The paper quantifies the value of additional target data in transfer learning, with the goal of minimizing sampling costs. It establishes the first minimax rates in terms of both source and target sample sizes, and shows that performance limits depend on new discrepancy measures called transfer exponents.
We aim to understand the value of additional labeled or unlabeled target data in transfer learning, for any given amount of source data. This is motivated by practical questions around minimizing sampling costs: target data is usually harder or costlier to acquire than source data, but can yield better accuracy. To this end, we establish the first minimax rates in terms of both source and target sample sizes, and show that performance limits are captured by new notions of discrepancy between source and target, which we refer to as transfer exponents.