Hierarchical Prompting Assists Large Language Model on Web Navigation
This work addresses a bottleneck in LLM performance for interactive tasks like web navigation, offering an incremental improvement in handling long observation traces.
The paper addresses the difficulty large language models have with complex observations in interactive decision-making tasks. It proposes a hierarchical prompting approach that first condenses each observation into a summary relevant to the next action, then predicts the action from that summary. On web navigation, the method improves task success rate by 6.2% over the previous state-of-the-art prompting approach.
Large language models (LLMs) struggle to process complicated observations in interactive decision-making tasks. To alleviate this issue, we propose a simple hierarchical prompting approach. Unlike previous prompting approaches that always put the full observation (e.g., a web page) into the prompt, we first use a dedicated SUMMARIZER prompt to construct an action-aware observation that is more condensed and relevant. The ACTOR prompt then predicts the next action based on the summarized observation. While our method is broadly applicable, we demonstrate its efficacy in the complex domain of web navigation, where a full observation often contains redundant and irrelevant information. Our approach outperforms the previous state-of-the-art prompting mechanism by 6.2% on task success rate, demonstrating its potential for interactive decision-making tasks with long observation traces.
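The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt templates, the `hierarchical_step` function, and the `llm` callable (any function that maps a prompt string to a completion string) are hypothetical names chosen for illustration, and the real prompts presumably also include in-context examples.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]

# Illustrative templates only; the wording is an assumption, not taken from the paper.
SUMMARIZER_TEMPLATE = (
    "Instruction: {instruction}\n"
    "Previous actions: {history}\n"
    "Full observation:\n{observation}\n"
    "Summarize only the parts of the observation relevant to the next action."
)

ACTOR_TEMPLATE = (
    "Instruction: {instruction}\n"
    "Previous actions: {history}\n"
    "Summarized observation:\n{summary}\n"
    "Next action:"
)


def hierarchical_step(
    llm: LLM, instruction: str, observation: str, history: List[str]
) -> str:
    """Run one SUMMARIZER call followed by one ACTOR call and return the action."""
    hist = ", ".join(history) if history else "none"

    # Stage 1 (SUMMARIZER): condense the full observation into an
    # action-aware summary.
    summary = llm(
        SUMMARIZER_TEMPLATE.format(
            instruction=instruction, history=hist, observation=observation
        )
    )

    # Stage 2 (ACTOR): predict the next action from the summary alone;
    # the full observation is never placed in this prompt.
    action = llm(
        ACTOR_TEMPLATE.format(
            instruction=instruction, history=hist, summary=summary
        )
    )
    return action.strip()
```

The point of the split is that the ACTOR prompt stays short regardless of page length, at the cost of one extra model call per step.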