LG · Feb 14, 2021

Large-Scale Meta-Learning with Continual Trajectory Shifting

arXiv:2102.07215v3 · 18 citations
Originality: Incremental advance
AI Analysis

This work addresses a technical bottleneck in meta-learning for large-scale, heterogeneous tasks, offering incremental improvements in efficiency and performance.

The paper extends meta-learning to many-shot scenarios by allowing more inner gradient steps and by increasing the frequency of meta-updates. The reported result is better generalization and better meta-level convergence than previous first-order meta-learning methods, and than multi-task learning and fine-tuning baselines.

Meta-learning of shared initialization parameters has been shown to be highly effective in solving few-shot learning tasks. However, extending the framework to many-shot scenarios, which may further enhance its practicality, has been relatively overlooked due to the technical difficulties of meta-learning over long chains of inner-gradient steps. In this paper, we first show that allowing the meta-learners to take a larger number of inner gradient steps better captures the structure of heterogeneous and large-scale task distributions, and thus yields better initialization points. Further, in order to increase the frequency of meta-updates even with excessively long inner-optimization trajectories, we propose to estimate the required shift of the task-specific parameters with respect to the change of the initialization parameters. By doing so, we can arbitrarily increase the frequency of meta-updates and thus greatly improve the meta-level convergence as well as the quality of the learned initializations. We validate our method on a heterogeneous set of large-scale tasks and show that the algorithm outperforms the previous first-order meta-learning methods, as well as multi-task learning and fine-tuning baselines, by a large margin in terms of both generalization performance and convergence.
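
To make the idea of meta-updating in the middle of an inner trajectory concrete, here is a minimal sketch on a toy problem. It is not the authors' implementation. Three things are assumptions made for illustration: the quadratic per-task losses, the Reptile-style first-order meta-update, and the shift estimate itself, which simply moves every task's parameters by the same amount as the initialization (the abstract does not say how the shift is estimated). The function name `meta_train_with_shifting` and all hyperparameter values are hypothetical.

```python
import numpy as np


def meta_train_with_shifting(centers, alpha=0.1, beta=0.5, inner_steps=5, episodes=50):
    """Toy first-order meta-learner that meta-updates after every inner step.

    Task t has the hypothetical loss 0.5 * ||theta - centers[t]||^2,
    so its gradient is simply (theta - centers[t]).
    """
    num_tasks, dim = centers.shape
    phi = np.zeros(dim)  # shared initialization
    for _ in range(episodes):
        # Every task starts its inner trajectory from the current initialization.
        thetas = np.tile(phi, (num_tasks, 1))
        for _ in range(inner_steps):
            # One inner gradient step per task.
            thetas = thetas - alpha * (thetas - centers)
            # Reptile-style first-order meta-update (an assumption of this sketch).
            delta_phi = beta * (thetas - phi).mean(axis=0)
            phi = phi + delta_phi
            # Assumed shift estimate: move every task's parameters by the same
            # amount as the initialization, so trajectories need not restart.
            thetas = thetas + delta_phi
    return phi


# Hypothetical task optima for three tasks in two dimensions.
centers = np.array([[1.0, 0.0], [0.0, 2.0], [-2.0, 1.0]])
phi = meta_train_with_shifting(centers)
print(phi)
```

In this toy setting every task has the same curvature, so the learned initialization settles at the mean of the task optima. The point of the sketch is only the control flow: the initialization is updated inside the inner loop, and the in-progress task parameters are shifted to follow it instead of being recomputed from scratch.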

