AI · Jan 30

AutoRefine: From Trajectories to Reusable Expertise for Continual LLM Agent Refinement

arXiv:2601.22758v1 · 9 citations · h-index: 8
Originality: Incremental advance
AI Analysis

AutoRefine addresses two problems in continual learning for LLM agents: accumulated knowledge degrades over time, and procedural knowledge is not captured. It is a domain-specific solution for tasks such as planning and science environments.

The paper tackles the failure of LLM agents to accumulate reusable knowledge from experience. It introduces AutoRefine, a framework that extracts and maintains dual-form Experience Patterns from execution histories. AutoRefine achieves 98.4% on ALFWorld, 70.4% on ScienceWorld, and 27.1% on TravelPlanner, with step reductions of 20-73%.

Large language model agents often fail to accumulate knowledge from experience, treating each task as an independent challenge. Recent methods extract experience as flattened textual knowledge, which cannot capture procedural logic of complex subtasks. They also lack maintenance mechanisms, causing repository degradation as experience accumulates. We introduce AutoRefine, a framework that extracts and maintains dual-form Experience Patterns from agent execution histories. For procedural subtasks, we extract specialized subagents with independent reasoning and memory. For static knowledge, we extract skill patterns as guidelines or code snippets. A continuous maintenance mechanism scores, prunes, and merges patterns to prevent repository degradation. Evaluated on ALFWorld, ScienceWorld, and TravelPlanner, AutoRefine achieves 98.4%, 70.4%, and 27.1% respectively, with 20-73% step reductions. On TravelPlanner, automatic extraction exceeds manually designed systems (27.1% vs 12.1%), demonstrating its ability to capture procedural coordination.
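The abstract describes the maintenance mechanism only at a high level (score, prune, merge), so the following is a minimal sketch under stated assumptions rather than the authors' implementation. The `Pattern` fields, the smoothed success-rate score, the word-overlap similarity, and all thresholds are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    """One stored Experience Pattern (all fields are hypothetical)."""
    name: str
    kind: str           # "subagent" (procedural) or "skill" (static knowledge)
    content: str        # guideline text, code snippet, or subagent prompt
    uses: int = 0       # times the pattern was retrieved
    successes: int = 0  # times the task then succeeded

    def score(self) -> float:
        # Assumed scoring rule: Laplace-smoothed success rate.
        return (self.successes + 1) / (self.uses + 2)


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, a stand-in for a real similarity measure."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def maintain(patterns, min_score=0.3, min_uses=3, merge_threshold=0.8):
    """Prune low-scoring patterns, then merge near-duplicates of the same kind."""
    # Prune: drop a pattern only once it has enough usage evidence and a low score.
    kept = [p for p in patterns if p.uses < min_uses or p.score() >= min_score]

    # Merge: visit best-scoring first so the stronger pattern's content survives.
    merged = []
    for p in sorted(kept, key=lambda q: q.score(), reverse=True):
        for m in merged:
            if m.kind == p.kind and jaccard(m.content, p.content) >= merge_threshold:
                # Pool usage statistics into the surviving pattern (in place).
                m.uses += p.uses
                m.successes += p.successes
                break
        else:
            merged.append(p)
    return merged


patterns = [
    Pattern("heat", "skill", "use the microwave to heat the object", 10, 9),
    Pattern("heat2", "skill", "use the microwave to heat the object", 4, 3),
    Pattern("bad", "skill", "open every drawer twice", 10, 0),
    Pattern("plan", "subagent", "search flights then book hotels", 1, 0),
]
result = maintain(patterns)
```

In this toy run, `bad` is pruned for its low score, `heat2` is folded into `heat`, and `plan` survives because it has too little usage evidence to be judged. Running such a pass periodically is one way a repository could stay compact as experience accumulates.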

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
