AIJan 13

Sparsity Is Necessary: Polynomial-Time Stability for Agentic LLMs in Large Action Spaces

arXiv:2601.08271v12.4h-index: 40

Originality Highly original

AI Analysis

This addresses the instability of tool-augmented LLM systems in sequential decision-making, providing theoretical guarantees for scalable control.

The paper tackles the problem of controlling LLMs in large action spaces by formalizing Sparse Agentic Control (SAC) and showing that sparsity is necessary for efficient learning, with estimation error scaling as k(log M / T)^{1/2} and exact support recovery possible when T > k log M.

Tool-augmented LLM systems expose a control regime that learning theory has largely ignored: sequential decision-making with a massive discrete action universe (tools, APIs, documents) in which only a small, unknown subset is relevant for any fixed task distribution. We formalize this setting as Sparse Agentic Control (SAC), where policies admit block-sparse representations over M >> 1 actions and rewards depend on sparse main effects and (optionally) sparse synergies. We study ell_{1,2}-regularized policy learning through a convex surrogate and establish sharp, compressed-sensing-style results: (i) estimation and value suboptimality scale as k (log M / T)^{1/2} under a Policy-RSC condition; (ii) exact tool-support recovery holds via primal-dual witness arguments when T > k log M under incoherence and beta-min; and (iii) any dense policy class requires Omega(M) samples, explaining the instability of prompt-only controllers. We further show that under partial observability, LLMs matter only through a belief/representation error epsilon_b, yielding an additive O(epsilon_b) degradation while preserving logarithmic dependence on M. Extensions cover tuning-free, online, robust, group-sparse, and interaction-aware SAC.

View on arXiv PDF

Similar