LG | Sep 29, 2022

Multiple Modes for Continual Learning

arXiv:2209.14996v1 | 2 citations | h-index: 68
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of scaling deep learning to systems that must adapt to streaming data without forgetting old tasks, though it appears incremental because it builds on prior continual learning strategies.

The paper tackles the problem of forgetting in continual learning by proposing MOTA, which trains multiple parameter modes and optimizes task allocation per mode, showing empirical improvements over baseline strategies across various distribution shifts.

Adapting model parameters to incoming streams of data is a crucial factor in deep learning scalability. Interestingly, prior continual learning strategies in online settings inadvertently anchor their updated parameters to a local parameter subspace to remember old tasks, or else drift away from the subspace and forget. From this observation, we formulate a trade-off between constructing multiple parameter modes and allocating tasks per mode. Mode-Optimized Task Allocation (MOTA), our contributed adaptation strategy, trains multiple modes in parallel, then optimizes task allocation per mode. We empirically demonstrate improvements over baseline continual learning strategies and across varying distribution shifts, namely sub-population, domain, and task shift.
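The intuition behind keeping several parameter modes can be illustrated with a toy sketch. This is not the MOTA algorithm: the one-dimensional quadratic task losses, the greedy "lowest current loss" allocation rule, and every function name below are assumptions made purely for illustration, and the modes here are updated one task at a time rather than trained in parallel. The sketch only shows why a single parameter vector that follows the stream forgets distant tasks, while a few modes with a sensible task allocation can stay close to all of them.

```python
# Toy illustration only: each "task" is a 1-D quadratic loss (theta - center)^2.
# The allocation rule is a hypothetical greedy heuristic, not the paper's method.

def task_loss(theta, center):
    """Loss of parameter theta on the task whose optimum is center."""
    return (theta - center) ** 2

def sgd_fit(theta, center, steps=50, lr=0.1):
    """Run plain gradient descent on one task, starting from theta."""
    for _ in range(steps):
        theta -= lr * 2.0 * (theta - center)
    return theta

def single_mode(centers, theta=0.0):
    """One parameter follows the stream; every task uses the final parameter."""
    for c in centers:
        theta = sgd_fit(theta, c)
    return theta

def multi_mode_greedy(centers, init_modes):
    """Assign each incoming task to the mode with the lowest current loss on it,
    then update only that mode."""
    modes = list(init_modes)
    allocation = []
    for c in centers:
        k = min(range(len(modes)), key=lambda i: task_loss(modes[i], c))
        allocation.append(k)
        modes[k] = sgd_fit(modes[k], c)
    return modes, allocation

# A stream that alternates between two clusters of task optima.
centers = [-2.0, -1.9, 2.0, 2.1, -2.1, 1.9]

theta = single_mode(centers)
single_avg = sum(task_loss(theta, c) for c in centers) / len(centers)

modes, allocation = multi_mode_greedy(centers, init_modes=[-1.0, 1.0])
multi_avg = sum(
    task_loss(modes[k], c) for k, c in zip(allocation, centers)
) / len(centers)

print("allocation:", allocation)
print("single-mode average loss:", round(single_avg, 3))
print("two-mode average loss:", round(multi_avg, 3))
```

In this toy stream the single parameter ends near the last task and incurs a large loss on the opposite cluster, whereas each of the two modes stays within its own cluster. The trade-off the abstract mentions shows up even here: more modes reduce drift per mode but raise the cost of deciding which tasks each mode should hold.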

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
