CL AI Apr 1, 2024

Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models

Wei He, Shichun Liu, Jun Zhao, Yiwen Ding, Yi Lu, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang

arXiv:2404.00884v116.234 citations h-index: 40 Has Code NAACL-HLT

Originality: Incremental advance

AI Analysis

The paper addresses a limitation of few-shot learning in LLMs: in real-world applications, query-specific demos are often scarce. The contribution appears incremental because it builds on existing prompting methods.

The paper tackles the problem that large language models struggle with out-of-demonstration (OOD) queries because few-shot prompting relies on high-quality, query-specific demos. It proposes Self-Demos, a prompting method that generates query-aware demos to bring an OOD query in-distribution (ID), and reports that the method outperforms state-of-the-art baselines on a custom dataset and two public math benchmarks.

Large language models (LLMs) have shown promising abilities of in-context learning (ICL), adapting swiftly to new tasks with only few-shot demonstrations. However, current few-shot methods heavily depend on high-quality, query-specific demos, which are often lacking. When faced with out-of-demonstration (OOD) queries, methods that rely on hand-crafted demos or external retrievers might fail. To bridge the gap between limited demos and OOD queries, we propose Self-Demos, a novel prompting method that elicits the inherent generalizability in LLMs by query-aware demo generation. The generated demos strategically interpolate between existing demos and the given query, transforming the query from OOD to ID. To evaluate the effectiveness of our approach, we manually constructed OOD-Toolset, a dataset in the tool-using scenario with over 300 real-world APIs and 1000 instances, each consisting of three tool-use cases as demos and an OOD query. Thorough experiments on our dataset and two public math benchmarks have shown that our method can outperform state-of-the-art baselines in the OOD setting. Moreover, we conduct a range of analyses to validate Self-Demos's generalization and provide more insights.
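The query-aware demo generation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-stage flow (generate demos close to the query, then answer with seed plus generated demos), the prompt wording, and every name here (`self_demos_answer`, `build_generation_prompt`, `parse_demo`, the `llm` callable) are assumptions made for illustration, and the paper's actual method may include further steps.

```python
from typing import Callable, List, Optional, Tuple

Demo = Tuple[str, str]  # hypothetical (query, answer) pair


def format_demos(demos: List[Demo]) -> str:
    """Render demos as 'Query: ... / Answer: ...' blocks."""
    return "\n\n".join(f"Query: {q}\nAnswer: {a}" for q, a in demos)


def build_generation_prompt(seed_demos: List[Demo], query: str) -> str:
    """Ask the model for a new solved example that sits near the target query.

    The wording is an assumption, not the prompt used in the paper.
    """
    return (
        "Here are some solved examples:\n\n"
        f"{format_demos(seed_demos)}\n\n"
        "Write one new solved example that resembles the target query below "
        "but is not identical to it. Use the format 'Query: ...' followed by "
        "'Answer: ...'.\n\n"
        f"Target query: {query}"
    )


def parse_demo(text: str) -> Optional[Demo]:
    """Extract a (query, answer) pair from model output, or None if malformed."""
    if "Query:" not in text or "Answer:" not in text:
        return None
    head, _, answer = text.partition("Answer:")
    _, _, question = head.partition("Query:")
    question, answer = question.strip(), answer.strip()
    return (question, answer) if question and answer else None


def self_demos_answer(
    llm: Callable[[str], str],
    seed_demos: List[Demo],
    query: str,
    num_generated: int = 2,
) -> str:
    """Generate query-aware demos, then answer with seed + generated demos."""
    generated: List[Demo] = []
    for _ in range(num_generated):
        demo = parse_demo(llm(build_generation_prompt(seed_demos, query)))
        if demo is not None:  # skip outputs that do not follow the format
            generated.append(demo)
    final_prompt = f"{format_demos(seed_demos + generated)}\n\nQuery: {query}\nAnswer:"
    return llm(final_prompt)
```

Here `llm` is any function that maps a prompt string to a completion string, so the sketch can be exercised with a stub before wiring in a real model client. Malformed generations are dropped rather than repaired, which keeps the final prompt well-formed at the cost of sometimes using fewer demos than requested.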

