More Bias, Less Bias: BiasPrompting for Enhanced Multiple-Choice Question Answering
This work addresses a key limitation of LLMs on multiple-choice tasks by proposing a novel inference framework for enhanced reasoning, though the contribution is incremental because it builds on existing prompting methods.
The paper tackles the incomplete exploration of answer choices by large language models in multiple-choice question answering. It introduces BiasPrompting, which generates and evaluates reasoning for every option and yields significant improvements across five benchmarks.
With the advancement of large language models (LLMs), their performance on multiple-choice question (MCQ) tasks has improved significantly. However, existing approaches face a key limitation: answer choices are typically presented to LLMs without contextual grounding or explanation. This absence of context can lead to incomplete exploration of the possible answers, ultimately degrading the models' reasoning capabilities. To address this challenge, we introduce BiasPrompting, a novel inference framework that guides LLMs to generate and critically evaluate reasoning across all plausible answer options before reaching a final prediction. It consists of two components: first, a reasoning generation stage, where the model is prompted to produce supporting reasoning for each answer option, and second, a reasoning-guided agreement stage, where the generated reasoning is synthesized to select the most plausible answer. Through comprehensive evaluations, BiasPrompting demonstrates significant improvements on five widely used multiple-choice question answering benchmarks. Our experiments show that BiasPrompting enhances the reasoning capabilities of LLMs and provides a strong foundation for tackling complex and challenging questions, particularly in settings where existing methods underperform.
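The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the prompt wording, the function names (`generate_reasonings`, `select_answer`, `bias_prompting`), and the answer-parsing rule are assumptions made for illustration, and `toy_llm` is a hypothetical stand-in for a real model call.

```python
from typing import Callable, Dict, Optional

# Hypothetical model interface: takes a prompt string, returns generated text.
LLM = Callable[[str], str]

FINAL_INSTRUCTION = "Answer with a single option label."


def generate_reasonings(llm: LLM, question: str, options: Dict[str, str]) -> Dict[str, str]:
    """Stage 1 (assumed prompt wording): ask for reasoning in favor of each option."""
    reasonings = {}
    for label, text in options.items():
        prompt = (
            f"Question: {question}\n"
            f"Assume the correct answer is ({label}) {text}.\n"
            "Give the strongest reasoning that supports this answer."
        )
        reasonings[label] = llm(prompt)
    return reasonings


def select_answer(
    llm: LLM, question: str, options: Dict[str, str], reasonings: Dict[str, str]
) -> Optional[str]:
    """Stage 2 (assumed prompt wording): show all reasoning, ask for one final label."""
    lines = [f"Question: {question}"]
    for label, text in options.items():
        lines.append(f"({label}) {text}")
        lines.append(f"Reasoning for ({label}): {reasonings[label]}")
    lines.append("Weigh the reasoning for every option and pick the most plausible one.")
    lines.append(FINAL_INSTRUCTION)
    reply = llm("\n".join(lines)).strip()
    # Simple parsing rule (an assumption): take the first token, drop punctuation.
    token = reply.split()[0].strip("().:,") if reply else ""
    return token if token in options else None


def bias_prompting(llm: LLM, question: str, options: Dict[str, str]) -> Optional[str]:
    """Run both stages and return the selected option label (or None if unparseable)."""
    reasonings = generate_reasonings(llm, question, options)
    return select_answer(llm, question, options, reasonings)


def toy_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline; always picks 'B'."""
    if prompt.rstrip().endswith(FINAL_INSTRUCTION):
        return "B"
    return "Placeholder supporting argument."


if __name__ == "__main__":
    choice = bias_prompting(toy_llm, "What is 2 + 2?", {"A": "3", "B": "4", "C": "5"})
    print(choice)
```

In practice `toy_llm` would be replaced by a call to an actual model. The sketch issues one call per option plus one final call, so the number of model calls grows linearly with the number of answer choices.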