CL, AI, LG | Oct 29, 2024

Auto-Intent: Automated Intent Discovery and Self-Exploration for Large Language Model Web Agents

arXiv:2410.22552v1 | 32 citations | h-index: 18 | EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient domain adaptation for LLM-based web agents. The advance is incremental because the method builds on existing pre-trained models rather than replacing them.

The paper tackles the problem of adapting pre-trained large language models as web navigation agents without fine-tuning, by discovering intents from demonstrations and using them to enhance decision-making, resulting in substantial performance improvements on benchmarks like Mind2Web and WebArena.

In this paper, we introduce Auto-Intent, a method to adapt a pre-trained large language model (LLM) as an agent for a target domain without direct fine-tuning, where we empirically focus on web navigation tasks. Our approach first discovers the underlying intents from target domain demonstrations in an unsupervised manner, in a highly compact form (up to three words). With the extracted intents, we train our intent predictor to predict the next intent given the agent's past observations and actions. In particular, we propose a self-exploration approach where the top-k probable intent predictions are provided as a hint to the pre-trained LLM agent, which leads to enhanced decision-making capabilities. Auto-Intent substantially improves the performance of GPT-{3.5, 4} and Llama-3.1-{70B, 405B} agents on the large-scale real-website navigation benchmarks from Mind2Web and on online navigation tasks from WebArena, with cross-benchmark generalization from Mind2Web.
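The self-exploration step described in the abstract (top-k intent predictions passed to the LLM agent as a hint) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `top_k_intents` and `build_prompt`, the prompt wording, and the toy scores are all assumptions made for illustration. The scores stand in for the output of a trained intent predictor, which is not shown here.

```python
import heapq


def top_k_intents(intent_scores, k=3):
    """Return the k highest-scoring intents, best first.

    intent_scores is a hypothetical mapping from a short intent phrase
    (up to three words) to a predictor score.
    """
    return heapq.nlargest(k, intent_scores, key=intent_scores.get)


def build_prompt(task, observation, intent_hints):
    """Assemble a prompt that offers the intents as hints, not commands.

    The prompt layout is an assumption; the source does not specify it.
    """
    hint_lines = "\n".join(f"- {intent}" for intent in intent_hints)
    return (
        f"Task: {task}\n"
        f"Observation: {observation}\n"
        f"Possible next intents (hints):\n{hint_lines}\n"
        "Choose the next action."
    )


# Toy scores standing in for a trained intent predictor's output.
scores = {
    "search flights": 0.55,
    "select date": 0.30,
    "open menu": 0.10,
    "close popup": 0.05,
}

hints = top_k_intents(scores, k=3)
prompt = build_prompt(
    "Book a flight to Paris",
    "<input id=12 placeholder='From'>",
    hints,
)
print(prompt)
```

Passing several candidate intents rather than only the single best one leaves the final choice of action to the LLM agent, which matches the abstract's description of the predictions as a hint.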

