CL · Jul 4, 2024

Question-Analysis Prompting Improves LLM Performance in Reasoning Tasks

Dharunish Yugeswardeenoo, Kevin Zhu, Sean O'Brien

arXiv:2407.03624v217.130 citations · h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving LLM reasoning, with applications in fields such as education and AI assistants, though it is an incremental advance over existing prompting methods.

The paper addresses LLM underperformance on reasoning tasks by proposing Question Analysis Prompting (QAP), a prompting strategy in which the model explains the question before solving it. QAP outperforms the compared state-of-the-art prompts on the AQuA and SAT datasets and ranks in the top 2 on 75% of the tests.

Although LLMs have the potential to transform many fields, they still underperform humans in reasoning tasks. Existing methods induce the model to produce step-by-step calculations, but this research explores the question: Does making the LLM analyze the question improve its performance? We propose a novel prompting strategy called Question Analysis Prompting (QAP), in which the model is prompted to explain the question in $n$ words before solving. The value of $n$ influences the length of the response generated by the model. QAP is evaluated on GPT 3.5 Turbo and GPT 4 Turbo on the arithmetic datasets GSM8K, AQuA, and SAT and the commonsense dataset StrategyQA. QAP is compared with other state-of-the-art prompts, including Chain-of-Thought (CoT), Plan and Solve Prompting (PS+), and Take A Deep Breath (TADB). QAP outperforms all of these prompts on the AQuA and SAT datasets on both GPT 3.5 Turbo and GPT 4 Turbo. QAP consistently ranks among the top-2 prompts on 75% of the tests. A key factor in QAP's performance is response length: detailed responses are beneficial on harder questions but can hurt performance on easy ones.
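To make the idea concrete, here is a minimal sketch of how a QAP-style prompt might be assembled for a given word budget $n$. The function name `build_qap_prompt`, the `Q:`/`A:` layout, and the instruction wording are assumptions for illustration; the paper's exact template may differ, and sending the prompt to a model is left out.

```python
def build_qap_prompt(question: str, n: int) -> str:
    """Return a QAP-style prompt: analyze the question in about n words, then solve.

    The instruction wording below is a hypothetical stand-in, not the
    authors' verbatim template.
    """
    if n <= 0:
        raise ValueError("n must be a positive word count")
    instruction = (
        f"Explain this problem in at least {n} words. "
        "Then solve for the answer."
    )
    # The question comes first; the analysis instruction follows it.
    return f"Q: {question.strip()}\nA: {instruction}"


if __name__ == "__main__":
    question = "A train travels 60 miles in 1.5 hours. What is its average speed?"
    # A larger n asks for a longer analysis and so tends to lengthen the response.
    for n in (25, 100):
        print(build_qap_prompt(question, n))
        print("---")
```

Varying `n` is the only knob in this sketch, which mirrors the abstract's point that the word budget controls response length and that a longer analysis may help hard questions while hurting easy ones.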
