LG
Jan 27, 2023

Projected Subnetworks Scale Adaptation

arXiv:2301.11487v1
h-index: 68
Originality: Incremental advance
AI Analysis

This paper addresses catastrophic forgetting in meta-learning for large models. The advance is incremental because it builds on existing gradient-based methods.

The paper tackles the problem of updating large zero/few-shot learning models on new tasks without degrading performance on previous tasks, and reports improved retention of seen-task and zero/few-shot task performance in online settings.

Large models support great zero-shot and few-shot capabilities. However, updating these models on new tasks can break performance on previously seen tasks and on their zero/few-shot unseen tasks. Our work explores how to update zero/few-shot learners so that they maintain performance on previous tasks, both seen and unseen, as well as on new tasks. By manipulating the parameter updates of a gradient-based meta-learner as projected task-specific subnetworks, we show improvements in the ability of large models to retain seen and zero/few-shot task performance in online settings.
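
The abstract does not say how the subnetworks are selected or how the projection is defined, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a Reptile-style meta-update (interpolating the meta-parameters toward task-adapted weights) and a hypothetical top-k magnitude rule for choosing a binary task mask. The names `top_k_mask`, `projected_meta_update`, `theta`, `phi`, and `meta_lr` are all made up for illustration.

```python
def top_k_mask(delta, k):
    """Binary mask keeping the k largest-magnitude entries of delta.

    Assumption: top-k magnitude is a stand-in selection rule, not the paper's.
    """
    keep = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)[:k]
    mask = [0.0] * len(delta)
    for i in keep:
        mask[i] = 1.0
    return mask


def projected_meta_update(theta, phi, mask, meta_lr=1.0):
    """Reptile-style interpolation restricted to the masked coordinates.

    Coordinates where mask is 0 keep their old value, so weights outside
    the task's subnetwork are left untouched by this update.
    """
    return [t + meta_lr * m * (p - t) for t, p, m in zip(theta, phi, mask)]


# Hypothetical meta-parameters and task-adapted weights after an inner loop.
theta = [0.0, 0.0, 0.0, 0.0]
phi = [0.5, -0.1, 0.05, -0.8]

delta = [p - t for p, t in zip(phi, theta)]
mask = top_k_mask(delta, k=2)
new_theta = projected_meta_update(theta, phi, mask, meta_lr=0.5)
```

In this toy setup only the two coordinates with the largest adaptation are moved. The intuition is that restricting each task's update to its own subnetwork limits interference with weights that earlier tasks rely on.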

