LG · Dec 13, 2023

Accelerating Meta-Learning by Sharing Gradients

arXiv:2312.08398v1 · 2 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses a meta-training efficiency bottleneck that matters to both researchers and practitioners. It is an incremental advance, since it builds on existing gradient-based meta-learning frameworks.

The paper tackles the problem of slow meta-training due to task-specific overfitting in gradient-based meta-learning by introducing an inner loop regularization mechanism that shares gradient information across tasks. The result is a method that enables meta-learning with larger inner loop learning rates and accelerates meta-training by up to 134% on few-shot classification datasets.

The success of gradient-based meta-learning is primarily attributed to its ability to leverage related tasks to learn task-invariant information. However, the absence of interactions between different tasks in the inner loop leads to task-specific over-fitting in the initial phase of meta-training. While this is eventually corrected by the presence of these interactions in the outer loop, it comes at a significant cost of slower meta-learning. To address this limitation, we explicitly encode task relatedness via an inner loop regularization mechanism inspired by multi-task learning. Our algorithm shares gradient information from previously encountered tasks as well as concurrent tasks in the same task batch, and scales their contribution with meta-learned parameters. We show using two popular few-shot classification datasets that gradient sharing enables meta-learning under bigger inner loop learning rates and can accelerate the meta-training process by up to 134%.
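The gradient-sharing idea described above can be illustrated with a minimal sketch. This is not the paper's update rule, which the abstract does not spell out. It assumes a linear regression model, a convex mix of each task's own gradient with the mean gradient of the other tasks in the batch and with a running average of past gradients, and fixed scalar weights `w_batch` and `w_past` standing in for the meta-learned scaling parameters. All function and parameter names here are hypothetical.

```python
import numpy as np

def task_gradient(theta, X, y):
    """Gradient of the mean squared error of a linear model y ~ X @ theta."""
    residual = X @ theta - y
    return 2.0 * X.T @ residual / len(y)

def shared_inner_step(theta, tasks, past_grad, lr, w_batch, w_past, decay=0.9):
    """One inner-loop step per task with gradient sharing (illustrative only).

    theta     : shared initialization, shape (d,)
    tasks     : list of (X, y) support sets forming one task batch
    past_grad : running average of gradients from earlier task batches
    w_batch   : weight on the concurrent tasks' mean gradient (assumed scalar;
                in the paper such scales are meta-learned)
    w_past    : weight on the running average of past gradients (assumed scalar)
    """
    grads = [task_gradient(theta, X, y) for X, y in tasks]
    total = np.sum(grads, axis=0)
    n_others = max(len(grads) - 1, 1)  # avoid division by zero for a single task

    adapted = []
    for g in grads:
        # Mean gradient of the other tasks in the same batch.
        others = (total - g) / n_others
        # Assumed mixing rule: convex combination of own and shared gradients.
        mixed = (1.0 - w_batch - w_past) * g + w_batch * others + w_past * past_grad
        adapted.append(theta - lr * mixed)

    # Update the running average with this batch's mean gradient.
    new_past = decay * past_grad + (1.0 - decay) * total / len(grads)
    return adapted, new_past

# Usage: two synthetic regression tasks sharing one initialization.
rng = np.random.default_rng(0)
theta0 = np.zeros(3)
batch = [(rng.normal(size=(5, 3)), rng.normal(size=5)) for _ in range(2)]
adapted_params, running_grad = shared_inner_step(
    theta0, batch, past_grad=np.zeros(3), lr=0.1, w_batch=0.3, w_past=0.1
)
```

With `w_batch = w_past = 0` the step reduces to an ordinary per-task inner-loop gradient step, so the shared terms act purely as a regularizer pulling each task's update toward directions that other tasks also favor.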
