CL · Feb 27, 2024

Unsupervised multiple choices question answering via universal corpus

arXiv:2402.17333v1 · 2 citations · h-index: 3 · ICASSP
Originality: Incremental advance
AI Analysis

The paper addresses the problem of reducing the annotation burden for question answering in new domains, though the contribution appears incremental because it builds on existing unsupervised methods.

The paper tackles unsupervised multiple-choice question answering by generating synthetic data from universal domain contexts without manual annotation, using named entities and knowledge graphs to create distractors, and demonstrates effectiveness on multiple datasets.

Unsupervised question answering is a promising yet challenging task that alleviates the burden of building large-scale annotated data in a new domain. This motivates us to study the unsupervised multiple-choice question answering (MCQA) problem. In this paper, we propose a novel framework that generates synthetic MCQA data based solely on contexts from the universal domain, without relying on any form of manual annotation. Possible answers are first extracted and used to produce related questions; we then leverage both named entities (NE) and knowledge graphs to discover plausible distractors and form complete synthetic samples. Experiments on multiple MCQA datasets demonstrate the effectiveness of our method.
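
To make the three stages in the abstract concrete (answer extraction, question generation, distractor discovery), here is a minimal toy sketch. It is not the authors' implementation: the `TOY_KG` dictionary, the cloze-style question, and all function names are hypothetical stand-ins for whatever NE tagger, question generator, and knowledge graph the paper actually uses.

```python
import random
import re

# Hypothetical toy "knowledge graph": entity -> (entity type, related entities).
# A real system would use an NE tagger and an actual knowledge graph instead.
TOY_KG = {
    "Paris": ("CITY", ["Lyon", "Berlin", "Madrid"]),
    "Berlin": ("CITY", ["Paris", "Munich", "Vienna"]),
    "Marie Curie": ("PERSON", ["Pierre Curie", "Albert Einstein", "Niels Bohr"]),
}


def extract_answers(context):
    """Stand-in for answer extraction: keep known entities that occur in the text."""
    return [entity for entity in TOY_KG if entity in context]


def make_cloze_question(context, answer):
    """Stand-in for question generation: blank out the answer in its sentence."""
    for sentence in re.split(r"(?<=[.!?])\s+", context):
        if answer in sentence:
            return sentence.replace(answer, "_____")
    return None


def pick_distractors(answer, context_entities, k=3):
    """Prefer same-type entities from the context, then knowledge-graph neighbours."""
    answer_type, neighbours = TOY_KG[answer]
    same_type = [e for e in context_entities
                 if e != answer and TOY_KG[e][0] == answer_type]
    # Deduplicate while preserving order, and never offer the answer itself.
    pool = [d for d in dict.fromkeys(same_type + neighbours) if d != answer]
    return pool[:k]


def build_samples(context, k=3, seed=0):
    """Assemble synthetic multiple-choice samples from one unlabeled context."""
    rng = random.Random(seed)
    entities = extract_answers(context)
    samples = []
    for answer in entities:
        question = make_cloze_question(context, answer)
        distractors = pick_distractors(answer, entities, k)
        if question is None or len(distractors) < k:
            continue  # skip answers that cannot form a full sample
        options = distractors + [answer]
        rng.shuffle(options)
        samples.append({"question": question, "options": options, "answer": answer})
    return samples


if __name__ == "__main__":
    text = "Marie Curie moved to Paris in 1891. She later visited Berlin."
    for sample in build_samples(text):
        print(sample)
```

Each resulting sample pairs one blanked-out sentence with the correct answer and `k` distractors, which is the shape of data an MCQA model could then be trained on without manual labels.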

