Unsupervised Candidate Answer Extraction through Differentiable Masker-Reconstructor Model
This work addresses the need for more generalizable and data-efficient answer extraction in question generation systems, though the contribution is incremental because it builds on existing unsupervised approaches.
The paper tackles the problem of candidate answer extraction for question generation by proposing an unsupervised Differentiable Masker-Reconstructor (DMR) model that leverages context structure and self-consistency, achieving performance superior to other unsupervised methods and comparable to supervised ones.
Question generation is a widely used data augmentation approach with extensive applications, and extracting qualified candidate answers from context passages is a critical step for most question generation systems. However, existing methods for candidate answer extraction rely on linguistic rules or annotated data, and they suffer from the partial annotation issue and generalize poorly. To overcome these limitations, we propose a novel unsupervised candidate answer extraction approach that leverages the inherent structure of context passages through a Differentiable Masker-Reconstructor (DMR) model, which enforces self-consistency to pick out salient information tokens. We curate two datasets with exhaustively annotated answers and benchmark a comprehensive set of supervised and unsupervised candidate answer extraction methods. We demonstrate the effectiveness of the DMR model by showing that it outperforms other unsupervised methods and performs comparably to supervised methods.