CL AI LG

May 24, 2023

Universal Self-Adaptive Prompting

Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Hanjun Dai, Julian Martin Eisenschlos, Sercan O. Arik, Tomas Pfister

arXiv:2305.14926v222.4143 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of improving zero-shot performance in LLMs across NLP applications and offers a versatile, automated solution. The advance is incremental because it builds on existing in-context learning methods.

The paper tackles weak zero-shot performance in large language models, which it attributes to a lack of guidance in the prompt. It introduces Universal Self-Adaptive Prompting (USP), an automatic prompt design method that uses unlabeled data and task categorization to select pseudo-demonstrations. USP outperforms standard zero-shot baselines and is often comparable or superior to few-shot baselines across more than 40 tasks.

A hallmark of modern large language models (LLMs) is their impressive general zero-shot and few-shot abilities, often elicited through in-context learning (ICL) via prompting. However, while highly coveted and being the most general, zero-shot performances in LLMs are still typically weaker due to the lack of guidance and the difficulty of applying existing automatic prompt design methods in general tasks when ground-truth labels are unavailable. In this study, we address this by presenting Universal Self-Adaptive Prompting (USP), an automatic prompt design approach specifically tailored for zero-shot learning (while compatible with few-shot). Requiring only a small amount of unlabeled data and an inference-only LLM, USP is highly versatile: to achieve universal prompting, USP categorizes a possible NLP task into one of the three possible task types and then uses a corresponding selector to select the most suitable queries and zero-shot model-generated responses as pseudo-demonstrations, thereby generalizing ICL to the zero-shot setup in a fully automated way. We evaluate USP with PaLM and PaLM 2 models and demonstrate performances that are considerably stronger than standard zero-shot baselines and often comparable to or even superior to few-shot baselines across more than 40 natural language understanding, natural language generation, and reasoning tasks.
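The selection step described in the abstract can be illustrated with a minimal sketch. The abstract says only that USP picks a selector according to task type and uses it to choose zero-shot queries and model-generated responses as pseudo-demonstrations. Everything concrete below is an assumption made for illustration: the two scoring functions (negative entropy for classification-style outputs, majority agreement for sampled short answers), the function names, the candidate tuple layout, and the prompt template are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A candidate is (query, zero-shot response, evidence used for scoring).
# This layout is a hypothetical choice for the sketch.
Candidate = Tuple[str, str, Sequence]


def negative_entropy(probs: Sequence[float]) -> float:
    """Assumed confidence score for classification-style tasks.

    Returns sum(p * log p), which is closer to 0 (higher) when the
    predicted class distribution is more peaked.
    """
    return sum(p * math.log(p) for p in probs if p > 0)


def majority_agreement(samples: Sequence[str]) -> float:
    """Assumed confidence score for short-form generation tasks.

    Returns the fraction of sampled answers that match the most
    common answer.
    """
    counts = Counter(samples)
    return counts.most_common(1)[0][1] / len(samples)


def select_pseudo_demos(
    candidates: List[Candidate],
    score_fn: Callable[[Sequence], float],
    k: int,
) -> List[Tuple[str, str]]:
    """Keep the k (query, response) pairs with the highest confidence score."""
    ranked = sorted(candidates, key=lambda c: score_fn(c[2]), reverse=True)
    return [(query, response) for query, response, _ in ranked[:k]]


def build_prompt(demos: List[Tuple[str, str]], test_query: str) -> str:
    """Prepend the selected pseudo-demonstrations to the test query."""
    blocks = [f"Q: {q}\nA: {r}" for q, r in demos]
    blocks.append(f"Q: {test_query}\nA:")
    return "\n\n".join(blocks)


# Toy unlabeled pool with made-up class probabilities from a zero-shot pass.
pool = [
    ("Review: great movie. Sentiment?", "positive", [0.95, 0.05]),
    ("Review: it was fine. Sentiment?", "positive", [0.55, 0.45]),
    ("Review: awful plot. Sentiment?", "negative", [0.10, 0.90]),
]
demos = select_pseudo_demos(pool, negative_entropy, k=2)
prompt = build_prompt(demos, "Review: a joy to watch. Sentiment?")
```

In this toy pool the ambiguous "it was fine" example is dropped because its predicted distribution is the least peaked. A fuller selector would likely also balance the chosen demonstrations across classes and discourage near-duplicate queries, but those refinements are left out of this sketch.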
