CL AI LG

Jan 16, 2024

Large Language Models are Null-Shot Learners

Pittawat Taveekitworachai, Febri Abdullah, Ruck Thawonmas

arXiv:2401.08273v3

1.93 citations

Originality: Incremental advance

AI Analysis

This work is aimed at AI researchers and practitioners who work with LLMs that still hallucinate. It offers a prompting method that may improve task performance in that setting. The advance is incremental because it builds on existing prompting techniques.

The paper proposes null-shot prompting, which exploits hallucination in large language models to improve task performance over standard zero-shot prompting. Experiments with eight LLMs show gains on the majority of eight datasets, covering reading comprehension, arithmetic reasoning, and closed-book question answering.

This paper presents null-shot prompting. Null-shot prompting exploits hallucination in large language models (LLMs) by instructing LLMs to utilize information from the "Examples" section that never exists within the provided context to perform a task. While reducing hallucination is crucial and non-negligible for daily and critical uses of LLMs, we propose that in the current landscape in which these LLMs still hallucinate, it is possible, in fact, to exploit hallucination to increase performance in performing tasks compared to standard zero-shot prompting. Experiments with eight LLMs show improvements in performance across the majority of eight datasets, including reading comprehension, arithmetic reasoning, and closed-book question answering. The observed inconsistency in increased relative performance across the LLMs also potentially indicates a different degree of inherent hallucination in each model. These differences show that it is possible to utilize null-shot prompting as a way to detect degrees of hallucination in LLMs using existing benchmarking datasets. We also perform ablation studies, including experimenting with a modified version of null-shot prompting that incorporates ideas from zero-shot chain-of-thought prompting, which shows different trends of results.
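To make the idea in the abstract concrete, here is a minimal sketch of how a null-shot prompt could be assembled next to its zero-shot baseline, and how a relative performance change between the two might be computed. The instruction wording in `NULL_SHOT_PREFIX`, the function names, and the percentage formula are illustrative assumptions, not taken from the paper; the only property carried over from the abstract is that the prompt refers to an "Examples" section that is never actually included.

```python
# Hypothetical instruction wording (an assumption; the paper's exact
# phrasing may differ). It points the model at an "Examples" section
# that is deliberately absent from the prompt.
NULL_SHOT_PREFIX = (
    'Look at examples in the "Examples" section and utilize examples '
    "and information from that section to perform the following task.\n"
)


def zero_shot_prompt(task: str) -> str:
    """Baseline: the task text alone, with no extra instruction."""
    return task


def null_shot_prompt(task: str) -> str:
    """Null-shot: prepend the instruction, but never add any examples."""
    return NULL_SHOT_PREFIX + task


def relative_change(zero_shot_score: float, null_shot_score: float) -> float:
    """Percentage change of the null-shot score relative to zero-shot.

    This formula is an assumed way to express "relative performance";
    the paper may define it differently.
    """
    if zero_shot_score == 0:
        raise ValueError("zero-shot score must be non-zero")
    return 100 * (null_shot_score - zero_shot_score) / zero_shot_score


if __name__ == "__main__":
    task = "What is 17 + 25?"
    print(null_shot_prompt(task))
    # Made-up scores, used only to show the calculation.
    print(relative_change(50.0, 55.0))
```

Both prompts would be sent to the same model on the same benchmark items, so that any difference in score can be attributed to the added instruction alone.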
