Mar 20, 2022

Continual Sequence Generation with Adaptive Compositional Modules

Georgia Tech
arXiv:2203.10652v2650 citations | h-index: 34 | Has Code
AI Analysis

This paper addresses continual learning for sequence generation models, enabling adaptation to new tasks without forgetting old ones. The contribution appears incremental, since it builds on existing transformer and continual learning methods.

The paper tackles catastrophic forgetting and inefficient parameter usage in continual sequence generation by proposing adaptive compositional modules that selectively add or reuse transformer modules based on task similarity. It reports outperforming state-of-the-art baselines in both performance and parameter efficiency across various sequences of generation tasks.

Continual learning is essential for real-world deployment when there is a need to quickly adapt the model to new tasks without forgetting knowledge of old tasks. Existing work on continual sequence generation either always reuses existing parameters to learn new tasks, which is vulnerable to catastrophic forgetting on dissimilar tasks, or blindly adds new parameters for every new task, which could prevent knowledge sharing between similar tasks. To get the best of both worlds, in this work, we propose continual sequence generation with adaptive compositional modules to adaptively add modules in transformer architectures and compose both old and new modules for new tasks. We also incorporate pseudo experience replay to facilitate knowledge transfer in those shared modules. Experiment results on various sequences of generation tasks show that our framework can adaptively add modules or reuse modules based on task similarity, outperforming state-of-the-art baselines in terms of both performance and parameter efficiency. We make our code public at https://github.com/GT-SALT/Adaptive-Compositional-Modules.
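The add-or-reuse idea in the abstract can be illustrated with a toy sketch. This is not the authors' decision procedure, which the abstract does not spell out. It swaps in a simple cosine-similarity threshold as a stand-in, and every name here (`assign_module`, the task names, the two-dimensional task vectors, the `threshold` of 0.8) is a hypothetical chosen for illustration. Pseudo experience replay is not modeled.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_module(pool, task_vec, threshold=0.8):
    """Reuse the most similar module in `pool`, or append a new one.

    `pool` is a list of prototype vectors, one per existing module
    (a hypothetical stand-in for real transformer modules).
    Returns (module_index, was_added).
    """
    best_idx, best_sim = -1, -1.0
    for i, proto in enumerate(pool):
        sim = cosine(proto, task_vec)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    # Similar enough to an existing module: share it with the new task.
    if best_idx >= 0 and best_sim >= threshold:
        return best_idx, False
    # Otherwise add a fresh module so the dissimilar task does not overwrite old ones.
    pool.append(list(task_vec))
    return len(pool) - 1, True


# Hypothetical task stream: two similar tasks followed by a dissimilar one.
pool = []
tasks = {
    "summarization_a": [1.0, 0.0],
    "summarization_b": [0.9, 0.1],
    "sql_generation": [0.0, 1.0],
}
assignments = {name: assign_module(pool, vec) for name, vec in tasks.items()}
```

In this toy stream the two summarization tasks end up sharing one module, while the dissimilar task gets a new one, which mirrors the trade-off the abstract describes between knowledge sharing and protection against forgetting.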

Code Implementations: 2 repos