LAMOL: LAnguage MOdeling for Lifelong Language Learning
LAMOL addresses catastrophic forgetting in language models that must learn multiple tasks over time, a setting that has received far less attention for language than for images or games.
The paper tackles lifelong language learning with LAMOL, a method that uses language modeling to generate pseudo-samples of previous tasks and replays them to prevent catastrophic forgetting. On five sequential language tasks, it performs only 2-3% worse than multitasking.
Most research on lifelong learning applies to images or games, but not language. We present LAMOL, a simple yet effective method for lifelong language learning (LLL) based on language modeling. LAMOL replays pseudo-samples of previous tasks while requiring no extra memory or model capacity. Specifically, LAMOL is a language model that simultaneously learns to solve the tasks and generate training samples. When the model is trained for a new task, it generates pseudo-samples of previous tasks for training alongside data for the new task. The results show that LAMOL prevents catastrophic forgetting without any sign of intransigence and can perform five very different language tasks sequentially with only one model. Overall, LAMOL outperforms previous methods by a considerable margin and is only 2-3% worse than multitasking, which is usually considered the LLL upper bound. The source code is available at https://github.com/jojotenya/LAMOL.
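The replay loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (see the linked repository for that). `ToyGenerativeModel`, `lifelong_train`, and the `sampling_ratio` parameter are hypothetical names introduced here. The toy model keeps a lookup table only so the sketch runs without an ML library, whereas the method described above generates pseudo-samples from the language model itself and needs no extra memory.

```python
import random
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (task name, text)


class ToyGenerativeModel:
    """Hypothetical stand-in for the single language model.

    A real model would produce pseudo-samples from its weights. This toy
    stores text in a lookup table purely so the sketch is runnable.
    """

    def __init__(self, seed: int = 0) -> None:
        self.seen_text: Dict[str, List[str]] = {}
        self.rng = random.Random(seed)

    def train(self, examples: List[Example]) -> None:
        # "Learning" here just records each example under its task.
        for task, text in examples:
            self.seen_text.setdefault(task, []).append(text)

    def generate(self, task: str) -> Example:
        # Emit a pseudo-sample for a task the model has already learned.
        return (task, self.rng.choice(self.seen_text[task]))


def lifelong_train(
    model: ToyGenerativeModel,
    task_stream: List[Tuple[str, List[str]]],
    sampling_ratio: float = 0.2,  # assumed knob: pseudo-samples per new example
) -> List[Tuple[str, int, int]]:
    """Train on tasks one at a time, replaying generated pseudo-samples.

    Returns a log of (task, number of new examples, number of pseudo-samples).
    """
    log: List[Tuple[str, int, int]] = []
    previous: List[str] = []
    for task, texts in task_stream:
        new_examples = [(task, text) for text in texts]
        # No previous tasks means nothing to replay for the first task.
        n_pseudo = int(sampling_ratio * len(texts)) if previous else 0
        # Spread the pseudo-samples round-robin over the previous tasks.
        pseudo = [
            model.generate(previous[i % len(previous)]) for i in range(n_pseudo)
        ]
        mixed = new_examples + pseudo
        model.rng.shuffle(mixed)
        model.train(mixed)
        log.append((task, len(new_examples), len(pseudo)))
        previous.append(task)
    return log
```

Calling `lifelong_train(ToyGenerativeModel(), stream)` with a list of `(task_name, texts)` pairs trains on each task in order. Every task after the first is mixed with pseudo-samples of the earlier tasks before the model sees it, which is the step the abstract credits with preventing forgetting.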