CLAug 15, 2023

Better Zero-Shot Reasoning with Role-Play Prompting

arXiv:2308.07702v2403 citationsh-index: 43Has Code
Originality Incremental advance
AI Analysis

This work addresses the underexplored influence of role-playing on reasoning abilities for users of LLMs, offering a novel prompting technique that outperforms standard methods, though it is incremental as it builds on existing prompting strategies.

The paper tackles the problem of enhancing zero-shot reasoning in large language models by introducing a role-play prompting method, which improves accuracy on benchmarks like AQuA from 53.5% to 63.8% and on Last Letter from 23.8% to 84.2%.

Modern large language models (LLMs) exhibit a remarkable capacity for role-playing, enabling them to embody not only human characters but also non-human entities. This versatility allows them to simulate complex human-like interactions and behaviors within various contexts, as well as to emulate specific objects or systems. While these capabilities have enhanced user engagement and introduced novel modes of interaction, the influence of role-playing on LLMs' reasoning abilities remains underexplored. In this study, we introduce a strategically designed role-play prompting methodology and assess its performance under the zero-shot setting across twelve diverse reasoning benchmarks. Our empirical results illustrate that role-play prompting consistently surpasses the standard zero-shot approach across most datasets. Notably, in experiments conducted using ChatGPT, accuracy on AQuA rises from 53.5% to 63.8%, and on Last Letter from 23.8% to 84.2%.Upon further comparison with the Zero-Shot-CoT technique, which prompts the model to "think step by step", our study demonstrates that role-play prompting acts as a more effective trigger for the CoT process. This highlights its potential to augment the reasoning capabilities of LLMs. We release our code at https://github.com/NKU-HLT/Role-Play-Prompting.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes