No Loss, No Gain: Gated Refinement and Adaptive Compression for Prompt Optimization
This work addresses the challenge of efficiently generating effective prompts for LLMs. It offers a scalable solution with significant performance gains and reduced computational cost, although its contribution is an incremental improvement over existing optimization methods.
The paper tackles automatic prompt optimization for large language models, a process that often gets stuck in local optima. It proposes GRACE, which combines gated refinement and adaptive compression strategies and achieves average relative performance improvements of 4.7%, 4.4%, and 2.7% over state-of-the-art methods on BIG-Bench Hard (BBH), domain-specific, and general NLP tasks, respectively (11 tasks in total), while using only 25% of the prompt generation budget required by prior methods.
Prompt engineering is crucial for leveraging the full potential of large language models (LLMs). While automatic prompt optimization offers a scalable alternative to costly manual design, generating effective prompts remains challenging. Existing methods often struggle to stably generate improved prompts, leading to low efficiency, and overlook that prompt optimization easily gets trapped in local optima. To address these issues, we propose GRACE, a framework that integrates two synergistic strategies: Gated Refinement and Adaptive Compression, achieving Efficient prompt optimization. The gated refinement strategy introduces a feedback regulation gate and an update rejection gate, which refine update signals to produce stable and effective prompt improvements. When optimization stagnates, the adaptive compression strategy distills the prompt's core concepts, restructuring the optimization trace and opening new paths. By strategically introducing information loss through refinement and compression, GRACE delivers substantial gains in performance and efficiency. In extensive experiments on 11 tasks across three practical domains, namely BIG-Bench Hard (BBH), domain-specific tasks, and general NLP tasks, GRACE achieves significant average relative performance improvements of 4.7%, 4.4%, and 2.7% over state-of-the-art methods, respectively. Further analysis shows that GRACE achieves these gains using only 25% of the prompt generation budget required by prior methods, highlighting its high optimization efficiency and low computational overhead. Our code is available at https://github.com/Eric8932/GRACE.
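The control flow described in the abstract can be pictured with a small toy loop. This is only a sketch under stated assumptions and not the authors' implementation: the names `optimize_prompt`, `score`, `propose`, `compress`, and `patience` are hypothetical, the update rejection gate is assumed to accept a candidate only when it strictly improves the score, and stagnation is assumed to mean `patience` consecutive rejections. The feedback regulation gate is omitted because the abstract does not specify how it works.

```python
from typing import Callable, Tuple


def optimize_prompt(
    prompt: str,
    score: Callable[[str], float],     # hypothetical: evaluates a prompt on held-out examples
    propose: Callable[[str], str],     # hypothetical: produces an updated candidate prompt
    compress: Callable[[str], str],    # hypothetical: distills a prompt to its core concepts
    steps: int = 20,
    patience: int = 3,                 # assumed stagnation threshold (consecutive rejections)
) -> Tuple[str, float]:
    """Toy loop: rejection-gated updates plus compression on stagnation."""
    current, current_score = prompt, score(prompt)
    best, best_score = current, current_score
    stalled = 0

    for _ in range(steps):
        candidate = propose(current)
        candidate_score = score(candidate)

        # Assumed update rejection gate: keep the candidate only if it improves.
        if candidate_score > current_score:
            current, current_score = candidate, candidate_score
            stalled = 0
        else:
            stalled += 1

        # Assumed stagnation rule: after `patience` rejections, compress the
        # current prompt. The score may drop here; that loss is deliberate and
        # lets the search restart from a shorter prompt.
        if stalled >= patience:
            current = compress(current)
            current_score = score(current)
            stalled = 0

        # Track the best prompt seen so far, so compression never loses it.
        if current_score > best_score:
            best, best_score = current, current_score

    return best, best_score
```

In this sketch the best prompt is tracked separately from the current one, so a compression step that temporarily lowers the score cannot make the returned result worse than the starting prompt.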