AI · CL · May 10

Do Self-Evolving Agents Forget? Capability Degradation and Preservation in Lifelong LLM Agent Adaptation

arXiv:2605.09315 · 93.2
Predicted impact top 30% in AI · last 90 days
Originality: Incremental advance
AI Analysis

For developers of lifelong LLM agents, this work highlights a critical non-monotonic degradation problem and provides a general stabilization principle to preserve capabilities during continual adaptation.

The paper identifies capability erosion in self-evolving LLM agents across workflow, skill, model, and memory evolution channels, and proposes Capability-Preserving Evolution (CPE) to mitigate it. CPE improves retained simple-task performance from 41.8% to 52.8% under GPT-5.1 optimization while enhancing complex-task adaptation.

Recent advances in LLM agents enable systems that autonomously refine workflows, accumulate reusable skills, self-train their underlying models, and maintain persistent memory. However, we show that such self-evolution is often non-monotonic: adapting to new task distributions can progressively degrade previously acquired capabilities across all major evolution channels. We identify this phenomenon as capability erosion under self-evolution and show that it consistently emerges across workflow, skill, model, and memory evolution. To mitigate this issue, we propose Capability-Preserving Evolution (CPE), a general stabilization principle that constrains destructive capability drift during continual adaptation. Across all four evolution dimensions, CPE consistently improves retained capability stability while preserving adaptation performance. For example, in workflow evolution, CPE improves retained simple-task performance from 41.8% to 52.8% under GPT-5.1 optimization while simultaneously achieving stronger complex-task adaptation. Our findings suggest that stable long-horizon self-evolving agents require not only acquiring new capabilities, but also explicitly preserving previously learned ones during continual adaptation.
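
The abstract does not spell out how CPE constrains capability drift, so the following is only an illustrative sketch of one way a preservation constraint could be wired into an evolution loop, not the paper's method. It assumes each proposed agent version has already been scored on a held-out set of previously mastered tasks and on the new target tasks. The names `Candidate`, `accept_update`, `evolve`, and the `max_drop` tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """A proposed agent version with two evaluation scores in [0, 1] (hypothetical)."""
    name: str
    retained_score: float  # accuracy on previously mastered (e.g. simple) tasks
    new_score: float       # accuracy on the new target (e.g. complex) tasks


def accept_update(current: Candidate, candidate: Candidate,
                  retained_floor: float) -> bool:
    """Accept a candidate only if it adapts better AND stays above the retention floor."""
    improves_adaptation = candidate.new_score > current.new_score
    preserves_capability = candidate.retained_score >= retained_floor
    return improves_adaptation and preserves_capability


def evolve(initial: Candidate, proposals: Iterable[Candidate],
           max_drop: float = 0.02) -> Candidate:
    """Greedy evolution loop with a fixed retention floor.

    The floor is anchored to the initial version rather than the current one,
    so many small accepted drops cannot accumulate into a large erosion.
    """
    retained_floor = initial.retained_score - max_drop
    current = initial
    for candidate in proposals:
        if accept_update(current, candidate, retained_floor):
            current = candidate
    return current


if __name__ == "__main__":
    v0 = Candidate("v0", retained_score=0.60, new_score=0.30)
    proposals = [
        Candidate("v1", retained_score=0.42, new_score=0.55),  # adapts, but erodes
        Candidate("v2", retained_score=0.59, new_score=0.45),  # adapts and preserves
        Candidate("v3", retained_score=0.50, new_score=0.70),  # adapts, but erodes
    ]
    print(evolve(v0, proposals).name)
```

In this toy run the scores are made up for illustration. An unconstrained loop that only maximized `new_score` would end at `v3` and lose ten points of retained capability, which is the kind of non-monotonic trade the abstract describes. The gate keeps `v2` instead.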