LG, AI | Nov 2, 2025

Continual Learning, Not Training: Online Adaptation For Agents

arXiv:2511.01093v1 | 1 citation | h-index: 1 | Has Code

Originality: Highly original

AI Analysis

This paper addresses the challenge of real-time adaptation for deployed AI agents. Its system-centric approach is novel, though its application to continual learning is incremental.

The paper addresses how deployed AI agents can adapt continually without retraining. It introduces ATLAS, a dual-agent architecture that shifts adaptation from model parameters to system-level orchestration. ATLAS achieves 54.1% success on a cyberthreat investigation benchmark while reducing cost by 86% compared to a larger model.

Continual Learning (CL) methods have traditionally focused on mitigating catastrophic forgetting through gradient-based retraining, an approach ill-suited for deployed agents that must adapt in real time. We introduce our Adaptive Teaching and Learning System (ATLAS), a dual-agent architecture that decouples reasoning (Teacher) from execution (Student) and incorporates a persistent learning memory that stores distilled guidance from experience. This informs the orchestration layer, enabling the system to dynamically adjust its operational strategies, such as supervision level or initial plan selection, at inference time. In doing so, ATLAS achieves gradient-free continual learning, shifting the locus of adaptation from model parameters to system-level orchestration. We formulate this as a system-centric paradigm for continual learning, where the objective is adaptive efficiency: maximizing task success while minimizing computational cost through inference-time orchestration rather than parameter updates. Evaluated on Microsoft's ExCyTIn-Bench, an open-source benchmark simulating complex cyberthreat investigation, ATLAS achieves 54.1% success with GPT-5-mini as its Student, outperforming the larger GPT-5 (High) by 13% while reducing cost by 86%. Cross-incident validation demonstrates generalization: frozen pamphlets from Incident #5 improve accuracy from 28% to 41% with zero retraining, while shifting output composition from verbose exploration to structured reasoning. Together, these findings establish gradient-free continual learning as a viable path toward adaptive, deployable AI systems and provide causally annotated traces valuable for training explicit world models.
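The loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`LearningMemory`, `choose_supervision`, `run_task`, the stub agents) and the rule that two stored notes are enough to drop to light supervision are assumptions made purely for illustration. The sketch only shows the general idea that adaptation happens in a persistent memory and an orchestration decision, with no gradient or parameter update anywhere.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LearningMemory:
    """Hypothetical persistent store of distilled guidance from past tasks."""
    guidance: List[str] = field(default_factory=list)

    def add(self, note: str) -> None:
        # Keep only non-empty, previously unseen notes.
        if note and note not in self.guidance:
            self.guidance.append(note)


def choose_supervision(memory: LearningMemory, threshold: int = 2) -> str:
    """Assumed orchestration rule: more stored guidance means less supervision."""
    return "light" if len(memory.guidance) >= threshold else "full"


def run_task(
    task: str,
    student: Callable[[str, List[str]], str],
    teacher: Callable[[str, str], str],
    memory: LearningMemory,
) -> Dict[str, str]:
    """Run one task; adaptation happens only through the memory, not weights."""
    level = choose_supervision(memory)
    # The Student executes the task with whatever guidance has accumulated.
    answer = student(task, list(memory.guidance))
    if level == "full":
        # The Teacher reviews the attempt and distills a reusable note.
        memory.add(teacher(task, answer))
    return {"answer": answer, "supervision": level}


# Stand-in agents; a real system would call language models here.
def stub_student(task: str, guidance: List[str]) -> str:
    return f"{task}: answered with {len(guidance)} hints"


def stub_teacher(task: str, answer: str) -> str:
    return f"lesson from {task}"


if __name__ == "__main__":
    memory = LearningMemory()
    for task in ["incident-a", "incident-b", "incident-c"]:
        print(run_task(task, stub_student, stub_teacher, memory))
```

In this toy setup the Teacher is consulted on the first two tasks and skipped on the third, which mirrors, in a very simplified way, the abstract's notion of adjusting supervision level at inference time to trade cost against task success.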
