LG ML · Oct 30, 2019

When MAML Can Adapt Fast and How to Assist When It Cannot

Sébastien M. R. Arnold, Shariq Iqbal, Fei Sha

arXiv:1910.13603v3 · 13.719 citations · Has Code

Originality: Incremental advance

AI Analysis

This work aims to help researchers and practitioners understand and improve fast adaptation in meta-learning, offering incremental insights and methods.

The paper investigates when MAML adapts quickly and studies meta-optimization methods, including a newly proposed one, for the cases where it does not. It finds that deep architectures improve adaptation even on tasks that need only a shallow model, and that upper layers are meta-learned to perform adaptive gradient updates. The resulting methods outperform MAML on both shallower and deeper architectures.

Model-Agnostic Meta-Learning (MAML) and its variants have achieved success in meta-learning tasks on many datasets and settings. On the other hand, we have just started to understand and analyze how they are able to adapt fast to new tasks. For example, one popular hypothesis is that the algorithms learn good representations for transfer, as in multi-task learning. In this work, we contribute by providing a series of empirical and theoretical studies, and discover several interesting yet previously unknown properties of the algorithm. We find MAML adapts better with a deep architecture even if the tasks need only a shallow one (and thus, no representation learning is needed). While echoing previous findings by others that the bottom layers in deep architectures enable representation learning, we also find that upper layers enable fast adaptation by being meta-learned to perform adaptive gradient updates when generalizing to new tasks. Motivated by these findings, we study several meta-optimization approaches and propose a new one for learning to optimize adaptively. Those approaches attain stronger performance than MAML in meta-learning both shallower and deeper architectures.
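For readers unfamiliar with the MAML update the abstract refers to, the following is a minimal sketch rather than the authors' code. It assumes a linear regression model and a single inner gradient step, so the meta-gradient can be written in closed form. The function names (`loss_and_grad`, `maml_meta_gradient`, `sample_task`) and the toy task distribution are hypothetical and chosen only for illustration.

```python
import numpy as np


def loss_and_grad(w, X, y):
    """Half mean squared error of a linear model and its gradient w.r.t. w."""
    residual = X @ w - y
    loss = 0.5 * np.mean(residual ** 2)
    grad = X.T @ residual / len(y)
    return loss, grad


def maml_meta_gradient(w, support, query, inner_lr):
    """One inner adaptation step, then the exact gradient of the
    post-adaptation query loss with respect to the initialization w."""
    Xs, ys = support
    Xq, yq = query

    # Inner loop: adapt the shared initialization on the support set.
    _, g_support = loss_and_grad(w, Xs, ys)
    w_adapted = w - inner_lr * g_support

    # Outer objective: loss of the adapted weights on the query set.
    query_loss, g_query = loss_and_grad(w_adapted, Xq, yq)

    # Chain rule through the inner step: d(w_adapted)/dw = I - inner_lr * H,
    # where H is the (constant) Hessian of the support loss for this model.
    hessian_support = Xs.T @ Xs / len(ys)
    jacobian = np.eye(len(w)) - inner_lr * hessian_support
    meta_grad = jacobian.T @ g_query
    return query_loss, meta_grad


def sample_task(rng, dim=3, n=10):
    """Toy task (an assumption): noiseless regression with random true weights."""
    w_true = rng.normal(size=dim)
    X = rng.normal(size=(2 * n, dim))
    y = X @ w_true
    return (X[:n], y[:n]), (X[n:], y[n:])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.zeros(3)
    for _ in range(100):
        support, query = sample_task(rng)
        _, meta_grad = maml_meta_gradient(w, support, query, inner_lr=0.1)
        w -= 0.05 * meta_grad  # plain SGD on the meta-objective
    print("meta-learned initialization:", w)
```

Here the inner update uses a fixed scalar step size. The meta-optimization approaches mentioned in the abstract instead learn how that update is shaped, which this sketch does not attempt to reproduce.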
