AI, CL, LG | Sep 21, 2024

StateAct: Enhancing LLM Base Agents via Self-prompting and State-tracking

arXiv:2410.02810v3 | 15 citations | h-index: 2 | Has Code
AI Analysis

This work addresses a key bottleneck in LLM agents for tasks such as robotics and web navigation, offering a scalable foundation that needs no additional training. The contribution is incremental, since it builds on existing base agent methods.

The paper tackles the problem of long-context reasoning and goal adherence in LLM-based autonomous agents by introducing StateAct, a base agent that uses self-prompting and state-tracking, resulting in performance gains of over 10% on Alfworld, 30% on Textcraft, and 7% on Webshop compared to ReAct.

Large language models (LLMs) are increasingly used as autonomous agents, tackling tasks from robotics to web navigation. Their performance depends on the underlying base agent. Existing methods, however, struggle with long-context reasoning and goal adherence. We introduce StateAct, a novel and efficient base agent that enhances decision-making through (1) self-prompting, which reinforces task goals at every step, and (2) chain-of-states, an extension of chain-of-thought that tracks state information over time. StateAct outperforms ReAct, the previous best base agent, by over 10% on Alfworld, 30% on Textcraft, and 7% on Webshop across multiple frontier LLMs. We also demonstrate that StateAct can be used as a drop-in replacement for ReAct with advanced LLM agent methods such as test-time scaling, yielding an additional 12% gain on Textcraft. By improving efficiency and long-range reasoning without requiring additional training or retrieval, StateAct provides a scalable foundation for LLM agents. We open source our code to support further research at https://github.com/ai-nikolai/stateact .
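The two ideas named in the abstract, restating the goal at every step and carrying an explicit state alongside the thought and action, can be illustrated with a minimal agent loop. This is a sketch under stated assumptions, not the authors' implementation: the `goal:` / `state:` / `thought:` / `action:` line format, the `Step` record, `parse_reply`, `run_episode`, and the `llm` and `env_step` callables are all hypothetical names chosen for illustration. In particular, the loop here re-inserts the goal text itself; whether the goal is restated by the model or by the harness is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    """One agent turn (hypothetical record, not the paper's data format)."""
    goal: str         # task goal, restated at this step
    state: str        # tracked state, e.g. location and inventory
    thought: str      # free-form reasoning
    action: str       # action sent to the environment
    observation: str  # environment response to the action

    def render(self) -> str:
        return (
            f"goal: {self.goal}\n"
            f"state: {self.state}\n"
            f"thought: {self.thought}\n"
            f"action: {self.action}\n"
            f"observation: {self.observation}"
        )


def parse_reply(reply: str) -> Dict[str, str]:
    """Parse 'key: value' lines from a model reply into a dict."""
    fields: Dict[str, str] = {}
    for line in reply.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields


def run_episode(
    goal: str,
    llm: Callable[[str], str],
    env_step: Callable[[str], Tuple[str, bool]],
    max_steps: int = 10,
) -> List[Step]:
    """Run an agent loop that repeats the goal and records state each step."""
    history: List[Step] = []
    first_observation = "episode start"
    for _ in range(max_steps):
        # Prompt = initial observation, all earlier steps, then the goal again.
        parts = [f"observation: {first_observation}"]
        parts += [step.render() for step in history]
        parts.append(f"goal: {goal}")
        fields = parse_reply(llm("\n".join(parts)))

        action = fields.get("action", "")
        observation, done = env_step(action)
        history.append(
            Step(goal, fields.get("state", ""), fields.get("thought", ""),
                 action, observation)
        )
        if done:
            break
    return history
```

Because every rendered step begins with the goal and includes the state line, the most recent goal and state always sit near the end of the prompt, however long the history grows. Any text-in, text-out model wrapper can be passed as `llm`, and any function that maps an action string to an `(observation, done)` pair can serve as `env_step`.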

Code Implementations: 1 repo
