LG · AI · CL · Oct 18, 2024

Collaboratively adding new knowledge to an LLM

arXiv:2410.14753v2 · 2 citations · h-index: 22 · Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses the practical challenge of continual learning for LLMs, though it appears incremental because it compares existing techniques such as LoRA, model merging, and replay.

The paper tackles the problem of sequentially adding new knowledge to an LLM while preserving previously learned information, finding that LoRA-based methods generally outperform full fine-tuning in both semi-cooperative and fully-cooperative settings.

We address the question of how to successively add new knowledge to an LLM whilst retaining previously-added knowledge. We consider two settings, semi-cooperative and fully-cooperative. Overall, LoRA performs better in most cases than full fine-tuning of all parameters when both new knowledge acquisition and retention of old, including recent, knowledge are taken into account. In the semi-cooperative setting, where datasets are not available after training, MoE mixing, model merging, and LoRA-based orthogonal subspace sequential learning, using a small weight on the orthogonality term, perform well. In the fully-cooperative setting, where datasets remain available, joint training and sequential training with replay are both effective approaches, with LoRA training generally preferable to full fine-tuning. The code needed to reproduce the results is provided in an open-source repository.
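
To make the "small weight on the orthogonality term" concrete, here is a minimal Python sketch of how such a penalty could be added to a task loss. It is an illustration under stated assumptions, not the paper's implementation: the function names, the squared-overlap form of the penalty, and the default weight of 0.01 are all hypothetical, and each LoRA down-projection matrix is assumed to be stored as a list of rank-direction rows.

```python
# Hypothetical sketch of an orthogonality penalty between LoRA subspaces.
# Each matrix is a list of rows; each row is one rank direction (shape r x d).

def matmul_abt(a, b):
    """Return a @ b^T for row-major lists of lists."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b]
            for row_a in a]

def orthogonality_penalty(a_new, a_old_list):
    """Sum of squared overlaps between the new adapter and each earlier one.

    The squared form is an assumption made for this sketch; an
    absolute-value form would serve the same purpose.
    """
    total = 0.0
    for a_old in a_old_list:
        overlap = matmul_abt(a_new, a_old)
        total += sum(v * v for row in overlap for v in row)
    return total

def total_loss(task_loss, a_new, a_old_list, ortho_weight=0.01):
    """Task loss plus a lightly weighted orthogonality term.

    ortho_weight=0.01 is an assumed placeholder, not a value from the paper.
    """
    return task_loss + ortho_weight * orthogonality_penalty(a_new, a_old_list)
```

The penalty is zero when the new adapter's directions are orthogonal to every earlier adapter's directions, so a small weight nudges new knowledge into an unused subspace without dominating the task loss.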

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
