CL, AI, LG | Jul 15, 2024

Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together

arXiv:2407.10930v2 | 46 citations | h-index: 20
Originality: Incremental advance
AI Analysis

This work addresses the optimization problem for modular NLP systems, offering a practical solution for researchers and practitioners, though it is incremental as it builds on existing fine-tuning and prompt optimization techniques.

The paper tackles the challenge of optimizing modular NLP pipelines like RAG, which lack intermediate labels for end-to-end optimization, by proposing a method that jointly optimizes both model weights and prompt templates. The result is an average performance improvement of up to 60% over weight-only optimization and 6% over prompt-only optimization across various tasks and language models.

Natural Language Processing (NLP) systems are increasingly taking the form of sophisticated modular pipelines, e.g., Retrieval Augmented Generation (RAG), where each module may involve a distinct Language Model (LM) and an associated prompt template. These compound systems often lack intermediate labels or gradient flow to optimize each module, making their end-to-end optimization challenging. Here we seek strategies to optimize both the module-level LM weights and the associated prompt templates of such systems to maximize a downstream task metric. We propose, for the first time, combining the weight and prompt optimization strategies to optimize a modular LM pipeline by alternating between the two so that the same LM teaches itself. In experiments with multi-hop QA, mathematical reasoning, and feature-based classification using mistral-7b, llama-2-7b, and llama-3-8b, these BetterTogether strategies, which optimize the weights and prompts of a pipeline together, outperform directly optimizing weights alone and prompts alone by up to 60% and 6%, respectively, on average across LMs and tasks. The BetterTogether optimizer is released in DSPy at http://dspy.ai.
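The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the DSPy BetterTogether implementation and uses no language model: the `Pipeline` class, the `metric` function, the prompt names, and the weight grid are all hypothetical stand-ins, chosen only so that the best weight depends on the chosen prompt. Under those assumptions, it shows why alternating the two optimizers can reach a score that neither reaches alone.

```python
from dataclasses import dataclass, replace
from typing import List

# Hypothetical search spaces: two candidate prompt templates and a coarse
# grid standing in for fine-tunable LM weights.
PROMPTS = ["terse", "step-by-step"]
GRID = [i / 10 for i in range(11)]


@dataclass(frozen=True)
class Pipeline:
    prompt: str    # stands in for a module's prompt template
    weight: float  # stands in for the module's LM weights


def metric(p: Pipeline) -> float:
    """Toy downstream metric (an assumption, not from the paper).

    Each prompt has its own ideal weight, so the two choices interact.
    """
    ideal_weight = {"terse": 0.2, "step-by-step": 0.8}[p.prompt]
    prompt_bonus = {"terse": 0.0, "step-by-step": 0.3}[p.prompt]
    return prompt_bonus + 1.0 - (p.weight - ideal_weight) ** 2


def optimize_prompt(p: Pipeline) -> Pipeline:
    # Prompt optimization: keep the weights fixed, pick the best template.
    return max((replace(p, prompt=c) for c in PROMPTS), key=metric)


def optimize_weights(p: Pipeline) -> Pipeline:
    # Weight optimization: keep the prompt fixed, pick the best weight.
    return max((replace(p, weight=w) for w in GRID), key=metric)


def alternate(p: Pipeline, schedule: List[str]) -> Pipeline:
    """Apply the optimizers in the order given by `schedule`."""
    for step in schedule:
        if step == "prompt":
            p = optimize_prompt(p)
        elif step == "weights":
            p = optimize_weights(p)
        else:
            raise ValueError(f"unknown step: {step}")
    return p


start = Pipeline(prompt="terse", weight=0.5)
prompt_only = alternate(start, ["prompt"])
weights_only = alternate(start, ["weights"])
together = alternate(start, ["prompt", "weights", "prompt"])

print("prompt only :", prompt_only, round(metric(prompt_only), 3))
print("weights only:", weights_only, round(metric(weights_only), 3))
print("together    :", together, round(metric(together), 3))
```

In this toy, optimizing weights alone settles on the best weight for the starting prompt, and optimizing the prompt alone leaves the weight mismatched; the alternating schedule changes the prompt first and then adapts the weight to it. In the paper's setting the weight step would be fine-tuning on the pipeline's own outputs, which this sketch does not attempt to model.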

