CL, AI, Aug 14, 2024

Large Language Models Prompting With Episodic Memory

arXiv:2408.07465v1, 5 citations, h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses prompt optimization for NLP practitioners, offering a more efficient and generalizable method. The advance is incremental, since it builds on existing RL and memory-based approaches.

The paper targets few-shot prompt optimization for Large Language Models, where existing methods are either resource-intensive or perform inadequately. It proposes POEM, a reinforcement learning-based technique that uses episodic memory and outperforms recent methods such as TEMPERA and RLPrompt by over 5.3% on text classification tasks.

Prompt optimization is essential for enhancing the performance of Large Language Models (LLMs) in a range of Natural Language Processing (NLP) tasks, particularly in scenarios of few-shot learning where training examples are incorporated directly into the prompt. Despite the growing interest in optimizing prompts with few-shot examples, existing methods for prompt optimization are often resource-intensive or perform inadequately. In this work, we propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple, efficient, and demonstrates strong generalization capabilities. We approach prompt optimization as a Reinforcement Learning (RL) challenge, using episodic memory to archive combinations of input data, permutations of few-shot examples, and the rewards observed during training. In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory. Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks. Furthermore, our approach adapts well to broader language understanding tasks, consistently outperforming conventional heuristic methods for ordering examples.
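
The test-time selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name EpisodicMemory, the use of cosine similarity over plain embedding vectors, and the choice to keep the maximum reward seen for a repeated (input, permutation) pair are all assumptions made for illustration. The sketch stores (input embedding, example permutation, reward) triples, retrieves the top-k most similar training inputs for a query, and returns the permutation with the highest total reward among them.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EpisodicMemory:
    """Hypothetical memory mapping a training-input embedding to the rewards
    observed for each permutation of few-shot examples tried with it."""

    def __init__(self):
        self._store = {}  # embedding tuple -> {permutation tuple: reward}

    def write(self, embedding, permutation, reward):
        slot = self._store.setdefault(tuple(embedding), {})
        perm = tuple(permutation)
        # Assumption: keep the best reward seen for a repeated pair.
        slot[perm] = max(float(reward), slot.get(perm, float("-inf")))

    def best_permutation(self, query_embedding, k=2):
        if not self._store:
            raise ValueError("episodic memory is empty")
        # Top-k training inputs most similar to the test query.
        neighbours = sorted(
            self._store,
            key=lambda emb: cosine(query_embedding, emb),
            reverse=True,
        )[:k]
        # Sum the rewards of each permutation across those neighbours.
        totals = defaultdict(float)
        for emb in neighbours:
            for perm, reward in self._store[emb].items():
                totals[perm] += reward
        return max(totals, key=totals.get)


if __name__ == "__main__":
    memory = EpisodicMemory()
    memory.write([1.0, 0.0], (0, 1, 2), 0.9)
    memory.write([0.9, 0.1], (1, 0, 2), 0.8)
    memory.write([0.0, 1.0], (2, 1, 0), 1.0)
    print(memory.best_permutation([1.0, 0.05], k=2))
```

In practice the embeddings would come from a text encoder and the rewards from the LLM's performance during training; both are left abstract here because the abstract does not specify them.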

