Generative Prompt Internalization
This work addresses the inference cost that fixed, lengthy prompts impose on large language models in agent-based applications, though the contribution appears incremental because it builds on existing prompt optimization techniques.
The paper tackles the computational overhead of fixed, lengthy prompts in large language model applications by proposing Generative Prompt Internalization (GenPI), a lightweight joint training method that both replicates the behavior of a prompted model and generates the prompt's content along with the reasons the model's behavior should change, enabling high performance and efficient inference without explicit prompts.
Prompts used in recent large language model-based applications are often fixed and lengthy, leading to significant computational overhead. To address this challenge, we propose Generative Prompt Internalization (GenPI), a lightweight method that employs a joint training approach. GenPI not only replicates the behavior of models with prompt inputs but also generates the content of the prompt along with reasons for why the model's behavior should change accordingly. We demonstrate that our approach effectively internalizes complex prompts across various agent-based application scenarios. For effective training without interactions with the dedicated environments, we introduce a data synthesis technique that autonomously collects conversational datasets by swapping the roles of the agent and environment. This method is especially useful in scenarios where only a predefined prompt is available without a corresponding training dataset. By internalizing complex prompts, GenPI enables high performance and efficient inference without the need for explicit prompts.
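The role-swapping data synthesis mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the message format, the names `swap_roles` and `synthesize_dialogue`, the `Responder` signature, and the `echo_model` stub that stands in for a real language model call. The sketch only shows the general idea of one model producing both sides of a conversation by viewing the history with the agent and environment roles exchanged.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
# Hypothetical signature: (prompt, history) -> next utterance.
Responder = Callable[[str, List[Message]], str]

FLIP = {"agent": "environment", "environment": "agent"}


def swap_roles(history: List[Message]) -> List[Message]:
    """Return a copy of the conversation with agent and environment roles exchanged."""
    return [{"role": FLIP[m["role"]], "content": m["content"]} for m in history]


def synthesize_dialogue(
    prompt: str, model: Responder, seed_input: str, num_turns: int = 2
) -> List[Message]:
    """Collect a conversation without a real environment (illustrative only).

    The same model answers as the agent, then answers again on the
    role-swapped history to produce the environment's next message.
    """
    history: List[Message] = [{"role": "environment", "content": seed_input}]
    for turn in range(num_turns):
        history.append({"role": "agent", "content": model(prompt, history)})
        if turn < num_turns - 1:
            # Roles swapped: the model now speaks for the environment.
            env_message = model(prompt, swap_roles(history))
            history.append({"role": "environment", "content": env_message})
    return history


def echo_model(prompt: str, history: List[Message]) -> str:
    """Stand-in for a language model call (assumption, not a real API)."""
    return f"reply to: {history[-1]['content']}"


dialogue = synthesize_dialogue("You are a shopping agent.", echo_model, "find a red mug")
for message in dialogue:
    print(message["role"], "|", message["content"])
```

In a real setting, `echo_model` would be replaced by a call to the prompted model, and the collected dialogues would serve as training data when only a predefined prompt is available.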