CL · Sep 16, 2021

Improving Unsupervised Question Answering via Summarization-Informed Question Generation

arXiv:2109.07954v1667 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for scalable and domain-agnostic training data in question answering, offering an incremental improvement over existing unsupervised methods.

The paper tackles the problem of generating training data for unsupervised question answering. Questions are generated heuristically from summaries and used to train a question generation model, which then creates synthetic QA pairs. A QA model trained on only 20k of these synthetic pairs substantially outperforms previous unsupervised models on six datasets, three in-domain and three out-of-domain.

Question Generation (QG) is the task of generating a plausible question for a given <passage, answer> pair. Template-based QG uses linguistically-informed heuristics to transform declarative sentences into interrogatives, whereas supervised QG uses existing Question Answering (QA) datasets to train a system to generate a question given a passage and an answer. A disadvantage of the heuristic approach is that the generated questions are heavily tied to their declarative counterparts. A disadvantage of the supervised approach is that the generated questions are heavily tied to the domain/language of the QA dataset used as training data. In order to overcome these shortcomings, we propose an unsupervised QG method which uses questions generated heuristically from summaries as a source of training data for a QG system. We make use of freely available news summary data, transforming declarative summary sentences into appropriate questions using heuristics informed by dependency parsing, named entity recognition and semantic role labeling. The resulting questions are then combined with the original news articles to train an end-to-end neural QG model. We extrinsically evaluate our approach using unsupervised QA: our QG model is used to generate synthetic QA pairs for training a QA model. Experimental results show that, trained with only 20k English Wikipedia-based synthetic QA pairs, the QA model substantially outperforms previous unsupervised models on three in-domain datasets (SQuAD1.1, Natural Questions, TriviaQA) and three out-of-domain datasets (NewsQA, BioASQ, DuoRC), demonstrating the transferability of the approach.
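To make the data-construction step in the abstract concrete, here is a minimal sketch of turning a declarative summary sentence into a question and pairing it with the source article. It is a toy illustration, not the authors' heuristics: the paper relies on dependency parsing, named entity recognition and semantic role labeling, whereas this sketch assumes the answer span and its entity type are given by hand and only handles answers at the start of the sentence. The names `WH_WORDS`, `make_question` and `make_training_example` are hypothetical.

```python
# Toy sketch only. The real method uses dependency parsing, NER and SRL;
# here the answer span and entity type are supplied by hand (assumption).

# Hypothetical mapping from entity type to wh-word.
WH_WORDS = {
    "PERSON": "Who",
    "DATE": "When",
    "LOCATION": "Where",
}


def make_question(sentence: str, answer: str, entity_type: str) -> str:
    """Replace a sentence-initial answer span with a wh-word."""
    if not sentence.startswith(answer):
        raise ValueError("this toy heuristic only handles sentence-initial answers")
    wh = WH_WORDS.get(entity_type, "What")
    # Drop the answer span and the trailing period, then add a question mark.
    rest = sentence[len(answer):].rstrip(". ").strip()
    return f"{wh} {rest}?"


def make_training_example(article: str, summary_sentence: str,
                          answer: str, entity_type: str):
    """Pair a question built from the summary with the full article.

    Assumption: an example is kept only if the answer string also occurs
    in the article, so that <passage, answer> is a valid QG input.
    """
    if answer not in article:
        return None
    return {
        "passage": article,
        "answer": answer,
        "question": make_question(summary_sentence, answer, entity_type),
    }


article = "Marie Curie, a physicist working in Paris, received the 1903 Nobel Prize."
summary = "Marie Curie won the Nobel Prize in 1903."
example = make_training_example(article, summary, "Marie Curie", "PERSON")
print(example["question"])  # Who won the Nobel Prize in 1903?
```

Because the question is built from the summary while the passage is the full article, the question wording need not mirror any single article sentence, which is the motivation the abstract gives for using summaries.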

