CL · Mar 15, 2023

UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation

Microsoft, Peking U
arXiv:2303.08518v4 · 162 citations · h-index: 102 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the generalization challenge in LLMs for users who need efficient zero-shot evaluation. The advance is incremental, since it builds on existing retrieval-based prompt methods.

The paper tackles a generalization problem: LLMs typically need model-specific fine-tuning or task-specific prompt engineering. It proposes UPRISE, a lightweight retriever that automatically retrieves prompts for a given zero-shot task input. The retriever is shown to transfer to unseen task types and to larger LLMs such as BLOOM-7.1B, OPT-66B, and GPT3-175B, and it mitigates hallucination in the authors' experiments with ChatGPT.

Large Language Models (LLMs) are popular for their impressive abilities, but the need for model-specific fine-tuning or task-specific prompt engineering can hinder their generalization. We propose UPRISE (Universal Prompt Retrieval for Improving zero-Shot Evaluation), which tunes a lightweight and versatile retriever that automatically retrieves prompts for a given zero-shot task input. Specifically, we demonstrate universality in a cross-task and cross-model scenario: the retriever is tuned on a diverse set of tasks, but tested on unseen task types; we use a small frozen LLM, GPT-Neo-2.7B, for tuning the retriever, but test the retriever on different LLMs of much larger scales, such as BLOOM-7.1B, OPT-66B and GPT3-175B. Additionally, we show that UPRISE mitigates the hallucination problem in our experiments with ChatGPT, suggesting its potential to improve even the strongest LLMs. Our model and code are available at https://github.com/microsoft/LMOps.
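The retrieve-then-prompt flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: UPRISE tunes a neural retriever using signals from a frozen LLM, whereas the sketch below swaps in a plain bag-of-words cosine similarity so that it runs without any model. The names `embed`, `retrieve_prompts`, `build_llm_input`, and the contents of `PROMPT_POOL` are hypothetical, and placing the retrieved prompts before the task input is an assumption about how they are combined.

```python
import math
from collections import Counter

# Hypothetical prompt pool: solved examples from training tasks.
PROMPT_POOL = [
    "Review: the movie was wonderful. Sentiment: positive",
    "Question: what is the capital of France? Answer: Paris",
    "Review: the food was terrible. Sentiment: negative",
]


def embed(text):
    """Toy stand-in for a tuned encoder: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_prompts(task_input, prompt_pool, k=2):
    """Return the k pool prompts most similar to the task input."""
    query = embed(task_input)
    ranked = sorted(
        prompt_pool,
        key=lambda p: cosine(query, embed(p)),
        reverse=True,
    )
    return ranked[:k]


def build_llm_input(task_input, prompt_pool, k=2):
    """Place the retrieved prompts before the task input (assumed layout).

    The resulting string would be passed to a frozen LLM; no model
    weights are updated at inference time.
    """
    prompts = retrieve_prompts(task_input, prompt_pool, k)
    return "\n\n".join(prompts + [task_input])
```

For a sentiment-style input such as `"Review: the movie was terrible. Sentiment:"`, the two review prompts score higher than the unrelated question-answering prompt, so they are the ones placed in front of the input. In the cross-model setting the abstract describes, only the LLM that consumes the string returned by `build_llm_input` would change; the retriever stays the same.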

Code Implementations: 1 repo