CLOct 25, 2023

PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization

Xinyuan Wang, Chenxi Li, Zhen Wang, Fan Bai, Haotian Luo, Jiayou Zhang, Nebojsa Jojic, Eric P. Xing, Zhiting Hu

arXiv:2310.16427v224.0253 citationsh-index: 97Has Code

Originality Incremental advance

AI Analysis

This addresses the challenge of reducing reliance on human experts for prompt optimization in AI applications, though it is an incremental advance building on existing planning and optimization techniques.

The paper tackles the problem of automating expert-level prompt engineering for large language models by introducing PromptAgent, a strategic planning method that uses Monte Carlo tree search and error feedback to generate high-quality prompts, achieving significant performance improvements over existing baselines across 12 tasks in domains like BIG-Bench Hard and NLP.

Highly effective, task-specific prompts are often heavily engineered by experts to integrate detailed instructions and domain insights based on a deep understanding of both instincts of large language models (LLMs) and the intricacies of the target task. However, automating the generation of such expert-level prompts remains elusive. Existing prompt optimization methods tend to overlook the depth of domain knowledge and struggle to efficiently explore the vast space of expert-level prompts. Addressing this, we present PromptAgent, an optimization method that autonomously crafts prompts equivalent in quality to those handcrafted by experts. At its core, PromptAgent views prompt optimization as a strategic planning problem and employs a principled planning algorithm, rooted in Monte Carlo tree search, to strategically navigate the expert-level prompt space. Inspired by human-like trial-and-error exploration, PromptAgent induces precise expert-level insights and in-depth instructions by reflecting on model errors and generating constructive error feedback. Such a novel framework allows the agent to iteratively examine intermediate prompts (states), refine them based on error feedbacks (actions), simulate future rewards, and search for high-reward paths leading to expert prompts. We apply PromptAgent to 12 tasks spanning three practical domains: BIG-Bench Hard (BBH), as well as domain-specific and general NLP tasks, showing it significantly outperforms strong Chain-of-Thought and recent prompt optimization baselines. Extensive analyses emphasize its capability to craft expert-level, detailed, and domain-insightful prompts with great efficiency and generalizability.

View on arXiv PDF Code

Similar