LG | AI | CL | May 20, 2025

Structured Agent Distillation for Large Language Model

Harvard
arXiv:2505.13820v2 | 5 citations | h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the practical deployment constraints of LLM agents for users who need smaller, more efficient models. It is an incremental advance because it builds on existing distillation and agent frameworks.

The paper tackles the high inference costs and large model sizes of LLM-based decision-making agents by proposing Structured Agent Distillation, a framework that compresses large agents into smaller student models while preserving reasoning and action fidelity. Experiments on ALFWorld, HotPotQA-ReAct, and WebShop show it outperforms baselines with significant compression and minimal performance drop.

Large language models (LLMs) exhibit strong capabilities as decision-making agents by interleaving reasoning and actions, as seen in ReAct-style frameworks. Yet, their practical deployment is constrained by high inference costs and large model sizes. We propose Structured Agent Distillation, a framework that compresses large LLM-based agents into smaller student models while preserving both reasoning fidelity and action consistency. Unlike standard token-level distillation, our method segments trajectories into [REASON] and [ACT] spans, applying segment-specific losses to align each component with the teacher's behavior. This structure-aware supervision enables compact agents to better replicate the teacher's decision process. Experiments on ALFWorld, HotPotQA-ReAct, and WebShop show that our approach consistently outperforms token-level and imitation learning baselines, achieving significant compression with minimal performance drop. Scaling and ablation results further highlight the importance of span-level alignment for efficient and deployable agents.
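
The span-level supervision described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which loss is applied to each span, so the per-token KL divergence, the weights `w_reason` and `w_act`, and all function names below are assumptions chosen for illustration. The sketch only shows the general idea of masking a trajectory by [REASON] and [ACT] spans and averaging a teacher-student divergence separately over each.

```python
import math

REASON, ACT = "[REASON]", "[ACT]"


def span_masks(tokens):
    """Return (reason_mask, act_mask): 0/1 lists marking each token's span.

    Marker tokens themselves are excluded from both masks.
    """
    reason_mask, act_mask = [], []
    current = None  # no span is active until the first marker is seen
    for tok in tokens:
        if tok in (REASON, ACT):
            current = tok
            reason_mask.append(0)
            act_mask.append(0)
        else:
            reason_mask.append(1 if current == REASON else 0)
            act_mask.append(1 if current == ACT else 0)
    return reason_mask, act_mask


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def masked_mean_kl(teacher, student, mask):
    """Average per-token KL over positions where mask is 1."""
    count = sum(mask)
    if count == 0:
        return 0.0
    total = sum(kl(t, s) for t, s, m in zip(teacher, student, mask) if m)
    return total / count


def structured_loss(tokens, teacher, student, w_reason=1.0, w_act=1.0):
    """Weighted sum of span-wise losses (weights are hypothetical)."""
    reason_mask, act_mask = span_masks(tokens)
    return (w_reason * masked_mean_kl(teacher, student, reason_mask)
            + w_act * masked_mean_kl(teacher, student, act_mask))


# Toy trajectory with a two-word vocabulary per position.
tokens = [REASON, "need", "lamp", ACT, "goto", "desk"]
teacher = [[0.5, 0.5], [0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.7, 0.3], [0.6, 0.4]]
student = [[0.5, 0.5], [0.6, 0.4], [0.8, 0.2], [0.5, 0.5], [0.7, 0.3], [0.1, 0.9]]

print(span_masks(tokens))
print(structured_loss(tokens, teacher, student))
```

Keeping the two spans in separate averages means a long reasoning span cannot drown out a short action span, which is one plausible motivation for span-level rather than token-level alignment. A different loss could be substituted for either span without changing the masking logic.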

