LG · Apr 24, 2025

Replay to Remember: Retaining Domain Knowledge in Streaming Language Models

arXiv:2504.17780v1
Originality: Incremental advance
AI Analysis

The paper offers practical insights for deploying adaptable LLMs in resource-constrained, real-world scenarios, though the advance is incremental.

The paper tackles catastrophic forgetting in streaming language models by combining LoRA with a minimal replay mechanism, showing that even a small amount of replay significantly stabilizes and partially restores domain-specific knowledge across the medical, genetics, and law domains.

Continual learning in large language models (LLMs) typically encounters the critical challenge of catastrophic forgetting, where previously acquired knowledge deteriorates upon exposure to new data. While techniques like replay buffers and parameter-efficient tuning (e.g., Low-Rank Adaptation or LoRA) have been proposed, few studies investigate real-time domain adaptation under strict computational and data-stream constraints. In this paper, we demonstrate a lightweight method combining LoRA and a minimal replay mechanism in a realistic streaming setting across three diverse knowledge domains: medical question answering, genetics, and law. Using perplexity, semantic similarity, and GPT-based human-like evaluation metrics, we quantify the model's adaptation, forgetting, and recovery over time. Our experiments reveal that while catastrophic forgetting naturally occurs, even minimal replay significantly stabilizes and partially restores domain-specific knowledge. This study contributes practical insights for deploying adaptable LLMs in resource-constrained, real-world scenarios.
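
To make the replay idea in the abstract concrete, here is a minimal sketch of how a small replay buffer could be mixed into a stream of domain batches, along with the standard perplexity formula. The abstract does not describe the buffer policy, so the reservoir-sampling strategy, the names `ReplayBuffer` and `mixed_batch`, and the `replay_fraction` parameter are all illustrative assumptions, not the authors' implementation. The LoRA fine-tuning step itself is omitted.

```python
import math
import random
from typing import List, Sequence


class ReplayBuffer:
    """Fixed-size buffer of past examples.

    Assumption: reservoir sampling is used so every example seen so far
    has an equal chance of being retained. The paper may use another policy.
    """

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.items: List[str] = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example: str) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Replace a stored item with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k: int) -> List[str]:
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def mixed_batch(
    new_examples: Sequence[str],
    buffer: ReplayBuffer,
    replay_fraction: float = 0.25,  # hypothetical knob, not from the paper
) -> List[str]:
    """Return the new examples plus a small sample of earlier ones."""
    n_replay = math.ceil(replay_fraction * len(new_examples))
    # Sample before adding, so replayed items come from earlier data only.
    replayed = buffer.sample(n_replay)
    for example in new_examples:
        buffer.add(example)
    return list(new_examples) + replayed


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp of the mean negative log-likelihood per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

In a streaming loop, each batch returned by `mixed_batch` would be passed to the LoRA update step, and `perplexity` would be tracked on held-out data from earlier domains to observe forgetting and recovery.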

