Characterization of Transfer Using Multi-task Learning Curves
This work helps researchers and practitioners understand the mechanisms behind transfer learning, though it appears incremental because it builds on existing multi-task learning approaches.
The authors characterize transfer effects in machine learning by modeling them with multi-task learning curves, which track inductive performance across varying sample sizes, rather than with gradient updates during training. Their results on a drug-target interaction benchmark indicate that learning curves capture multi-task learning effects better than the training-time view, and that their multi-task extensions can delineate pairwise and contextual transfer effects in foundation models.
Transfer effects manifest themselves both during training on a fixed data set and in inductive inference on accumulating data. We hypothesize that perturbing the data set by including more samples, instead of perturbing the model by gradient updates, provides a complementary and more fundamental characterization of transfer effects. To capture this phenomenon, we quantitatively model transfer effects using multi-task learning curves that approximate the inductive performance over varying sample sizes. We describe an efficient method to approximate multi-task learning curves, analogous to the Task Affinity Grouping method applied during training. We compare the statistical and computational approaches to transfer; the comparison indicates considerably higher compute costs for the former, but better power and broader applicability. Evaluations are performed using a benchmark drug-target interaction data set. Our results show that learning curves can better capture the effects of multi-task learning and that their multi-task extensions can delineate pairwise and contextual transfer effects in foundation models.
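As a rough illustration of the learning-curve view of transfer, the following minimal sketch fits a curve to the error of a target task trained alone and to its error when trained jointly with a second task, then reads off the gap between the two curves at a given sample size. It is not the authors' approximation method: the power-law form err(n) ≈ a·n^(-b), the function names (`fit_power_law`, `predict`, `transfer_gain`), and the synthetic, noise-free numbers are all assumptions made for illustration.

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit err(n) ~ a * n**(-b) by least squares in log-log space.

    The power-law form is an assumption of this sketch.
    """
    slope, intercept = np.polyfit(np.log(sizes), np.log(errors), deg=1)
    return float(np.exp(intercept)), float(-slope)

def predict(params, n):
    """Evaluate the fitted curve at sample size(s) n."""
    a, b = params
    return a * np.asarray(n, dtype=float) ** (-b)

def transfer_gain(single_params, multi_params, n):
    """Error gap between the two curves; positive means the
    multi-task curve has lower error at sample size n."""
    return predict(single_params, n) - predict(multi_params, n)

# Synthetic, noise-free values for illustration only (not data from the paper).
sizes = np.array([100, 200, 400, 800, 1600], dtype=float)
single_err = 2.0 * sizes ** -0.30  # hypothetical: target task trained alone
multi_err = 2.0 * sizes ** -0.40   # hypothetical: target task plus one helper task

single_params = fit_power_law(sizes, single_err)
multi_params = fit_power_law(sizes, multi_err)
gain_at_1000 = float(transfer_gain(single_params, multi_params, 1000))
```

In this sketch, a positive `gain_at_1000` would be read as positive pairwise transfer from the helper task at 1000 samples. With real measurements, each point on a curve would require training at that sample size, which is one way to see why the statistical approach is described as more compute-intensive.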