LG | Aug 2, 2022

The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence

Brando Miranda, Patrick Yu, Yu-Xiong Wang, Sanmi Koyejo

arXiv:2208.01545v1 | 11.814 citations | h-index: 39

Originality: Incremental advance

AI Analysis

This work shows the meta-learning community that low task diversity in benchmarks leads to empirical equivalence between MAML and transfer learning. The contribution is incremental: it clarifies existing claims rather than introducing new methods.

The paper asks when meta-learning algorithms such as MAML outperform transfer learning in few-shot learning. It finds that on benchmarks with low task diversity, such as MiniImageNet and CIFAR-FS, MAML and transfer learning reach equivalent meta-test performance under a fair comparison, and that this similarity in accuracy holds across model sizes.

Recently, it has been observed that a transfer learning solution might be all we need to solve many few-shot learning benchmarks, raising important questions about when and how meta-learning algorithms should be deployed. In this paper, we seek to clarify these questions by (1) proposing a novel metric, the diversity coefficient, to measure the diversity of tasks in a few-shot learning benchmark and (2) comparing Model-Agnostic Meta-Learning (MAML) and transfer learning under fair conditions (same architecture, same optimizer, and all models trained to convergence). Using the diversity coefficient, we show that the popular MiniImageNet and CIFAR-FS few-shot learning benchmarks have low diversity. This novel insight contextualizes claims that transfer learning solutions are better than meta-learned solutions in the regime of low diversity under a fair comparison. Specifically, we empirically find that a low diversity coefficient correlates with a high similarity between transfer learning and MAML learned solutions in terms of accuracy at meta-test time and classification layer similarity (using feature-based distance metrics like SVCCA, PWCCA, CKA, and OPD). To further support our claim, we find that this equivalence in meta-test accuracy holds even as the model size changes. Therefore, we conclude that in the low diversity regime, MAML and transfer learning have equivalent meta-test performance when both are compared fairly. We also hope our work inspires more thoughtful constructions and quantitative evaluations of meta-learning benchmarks in the future.
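
To make the representation-similarity comparison concrete, here is a minimal sketch of linear CKA, one of the feature-based metrics named in the abstract. It is not the authors' implementation: the function names, the random stand-in features, and the choice of 1 - CKA as the distance are assumptions made for illustration.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_examples, n_features).

    Both matrices must hold activations for the same examples, in the same order.
    """
    # Center each feature column so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


def cka_distance(x, y):
    """Assumed distance form: 1 - CKA, so identical representations give 0."""
    return 1.0 - linear_cka(x, y)


# Hypothetical stand-ins for the features of two models on the same 64 inputs.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(64, 16))
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
feats_b = feats_a @ rotation          # same representation, rotated basis
feats_c = rng.normal(size=(64, 16))   # unrelated representation

print(cka_distance(feats_a, feats_b))  # close to 0: CKA ignores rotations
print(cka_distance(feats_a, feats_c))  # clearly larger than 0
```

In a comparison like the one the abstract describes, `feats_a` and `feats_b` would be replaced by the activations of a MAML-trained and a transfer-learned model at the same layer on the same meta-test inputs. A small distance would then indicate similar learned solutions.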

