Beyond Fine-Tuning: In-Context Learning and Chain-of-Thought for Reasoned Distractor Generation
For educators and test developers, this work automates the creation of plausible distractors that mimic expert reasoning, reducing reliance on domain experts.
The paper addresses distractor generation for multiple-choice questions, proposing a rationale-augmented framework using in-context learning and chain-of-thought reasoning with LLMs. It achieves state-of-the-art results across six benchmarks, outperforming recent fine-tuning-based methods.
Distractor generation (DG) remains a labor-intensive task that still depends heavily on domain experts. The task focuses on generating plausible yet incorrect options, known as distractors, for multiple-choice questions. A reliable distractor must be contextually relevant to the question and able to mislead examinees through implicit reasoning as they identify the correct answer. A recent method combines fine-tuning of pre-trained encoder-decoder models with contrastive learning to generate semantically relevant distractors for a given question-answer pair, but it often fails to capture the underlying reasoning process that experts use when selecting distractors in benchmarks. In this paper, we explore large language model (LLM) reasoning for DG through in-context learning, with unsupervised semantic retrieval for selecting few-shot examples. We design a rationale-augmented DG framework that jointly generates distractors and their rationales for a given question-answer pair. Extensive experiments on six benchmarks, with varying average distractor lengths and domains, demonstrate that prompting LLMs with few-shot examples substantially improves performance over recent DG models and achieves state-of-the-art results in generating reasoned distractors that align with human-labeled benchmarks.
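To make the retrieval-then-prompt idea concrete, here is a minimal sketch of selecting few-shot examples by similarity and assembling a rationale-augmented prompt. It is an illustration under stated assumptions, not the paper's implementation: bag-of-words cosine similarity stands in for whatever semantic retriever the authors use, the example pool is invented, and the function names (`retrieve_examples`, `build_prompt`), the prompt wording, and the rationale-before-distractors ordering are all hypothetical. The call to an actual LLM is omitted.

```python
import math
import re
from collections import Counter


def bow_vector(text):
    """Lowercased bag-of-words counts (assumption: a crude stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_examples(question, pool, k=2):
    """Return the k pool items whose questions are most similar to the query question."""
    q_vec = bow_vector(question)
    ranked = sorted(
        pool,
        key=lambda ex: cosine(q_vec, bow_vector(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, answer, examples):
    """Assemble a few-shot prompt that asks for a rationale and distractors (hypothetical format)."""
    parts = ["Write a rationale, then plausible but incorrect options (distractors)."]
    for ex in examples:
        parts.append(
            f"Question: {ex['question']}\n"
            f"Answer: {ex['answer']}\n"
            f"Rationale: {ex['rationale']}\n"
            f"Distractors: {'; '.join(ex['distractors'])}"
        )
    # The final block is left open so the model continues with its own rationale.
    parts.append(f"Question: {question}\nAnswer: {answer}\nRationale:")
    return "\n\n".join(parts)


# Invented toy pool, for illustration only.
POOL = [
    {
        "question": "Which planet is known as the Red Planet?",
        "answer": "Mars",
        "rationale": "Other familiar planets tempt someone unsure which one looks red.",
        "distractors": ["Venus", "Jupiter", "Mercury"],
    },
    {
        "question": "What gas do plants absorb during photosynthesis?",
        "answer": "Carbon dioxide",
        "rationale": "Other common atmospheric gases are easy to confuse with it.",
        "distractors": ["Oxygen", "Nitrogen", "Hydrogen"],
    },
    {
        "question": "Which planet is the largest in the solar system?",
        "answer": "Jupiter",
        "rationale": "Other large outer planets are natural wrong guesses.",
        "distractors": ["Saturn", "Neptune", "Uranus"],
    },
]

shots = retrieve_examples("Which planet is closest to the Sun?", POOL, k=2)
prompt = build_prompt("Which planet is closest to the Sun?", "Mercury", shots)
```

Here the two planet questions are retrieved ahead of the photosynthesis question, so the demonstrations shown to the model resemble the target question. In practice the lexical similarity would be replaced by a proper semantic encoder, and `prompt` would be sent to an LLM whose completion is parsed into a rationale and a list of distractors.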