CL AI Sep 23, 2022

Multiple-Choice Question Generation: Towards an Automated Assessment Framework

arXiv:2209.11830v1 5.0 49 citations h-index: 61

Originality: Synthesis-oriented

AI Analysis

This work addresses scalable, cost-effective assessment of automated multiple-choice question generation for English comprehension. The contribution is incremental, as it builds on existing transformer-based methods.

The authors propose a set of performance criteria for evaluating automated multiple-choice question generation systems: grammatical correctness, answerability, diversity, and complexity. They describe an initial system for each metric and evaluate each one individually on standard multiple-choice reading comprehension corpora.

Automated question generation is an important approach to enable personalisation of English comprehension assessment. Recently, transformer-based pretrained language models have demonstrated the ability to produce appropriate questions from a context paragraph. Typically, these systems are evaluated against a reference set of manually generated questions using n-gram based metrics, or manual qualitative assessment. Here, we focus on a fully automated multiple-choice question generation (MCQG) system where both the question and possible answers must be generated from the context paragraph. Applying n-gram based approaches is challenging for this form of system as the reference set is unlikely to capture the full range of possible questions and answer options. Conversely, manual assessment scales poorly and is expensive for MCQG system development. In this work, we propose a set of performance criteria that assess different aspects of the generated multiple-choice questions of interest. These qualities include: grammatical correctness, answerability, diversity and complexity. Initial systems for each of these metrics are described, and individually evaluated on standard multiple-choice reading comprehension corpora.
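
The limitation of n-gram based evaluation described above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `ngrams` and `ngram_precision`, the whitespace tokenisation, and the example questions are all assumptions made for illustration. It computes a clipped n-gram precision (in the style of BLEU's modified precision) of a generated question against a reference set.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, references, n=2):
    """Clipped n-gram precision of a candidate against a set of references.

    Each candidate n-gram count is clipped by its maximum count in any
    single reference. Tokenisation is a plain whitespace split (assumption).
    """
    cand = ngrams(candidate.lower().split(), n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.lower().split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())


# Hypothetical context: "The quick brown fox jumps over the lazy dog."
# Both questions below are answerable from it, but only one is in the
# (made-up) reference set.
references = ["what did the fox jump over ?"]
same_as_reference = "what did the fox jump over ?"
valid_but_different = "which animal was described as lazy ?"

score_same = ngram_precision(same_as_reference, references)
score_different = ngram_precision(valid_but_different, references)
```

Under these assumptions, the second question shares no bigrams with the reference and so scores zero despite being a reasonable question about the same context, which is the kind of mismatch that motivates reference-free criteria such as answerability and diversity.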
