LGOCDec 16, 2024

Memory-Reduced Meta-Learning with Guaranteed Convergence

arXiv:2412.12030v11 citationsh-index: 3AAAI
Originality Incremental advance
AI Analysis

This work addresses memory efficiency for researchers and practitioners using meta-learning, though it is incremental as it builds on existing optimization-based approaches.

The paper tackles the high memory overhead in optimization-based meta-learning by proposing an algorithm that avoids using historical parameters/gradients, reducing memory costs while guaranteeing sublinear convergence and matching computational complexity to existing methods.

The optimization-based meta-learning approach is gaining increased traction because of its unique ability to quickly adapt to a new task using only small amounts of data. However, existing optimization-based meta-learning approaches, such as MAML, ANIL and their variants, generally employ backpropagation for upper-level gradient estimation, which requires using historical lower-level parameters/gradients and thus increases computational and memory overhead in each iteration. In this paper, we propose a meta-learning algorithm that can avoid using historical parameters/gradients and significantly reduce memory costs in each iteration compared to existing optimization-based meta-learning approaches. In addition to memory reduction, we prove that our proposed algorithm converges sublinearly with the iteration number of upper-level optimization, and the convergence error decays sublinearly with the batch size of sampled tasks. In the specific case in terms of deterministic meta-learning, we also prove that our proposed algorithm converges to an exact solution. Moreover, we quantify that the computational complexity of the algorithm is on the order of $\mathcal{O}(ε^{-1})$, which matches existing convergence results on meta-learning even without using any historical parameters/gradients. Experimental results on meta-learning benchmarks confirm the efficacy of our proposed algorithm.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes