CL, AI, LG · Sep 10, 2021

Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning

arXiv:2109.04689v1 · 661 citations
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for conversational news recommendation systems, offering an incremental improvement by mitigating exposure bias with a differentiable reward function.

The paper tackles generating question-answer pairs from news articles, focusing on self-contained, summary-centric questions and length-constrained answers, and achieves high answer accuracy as shown by automatic metrics and human evaluation.

Motivated by suggested question generation in conversational news recommendation systems, we propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers. We begin by collecting a new dataset of news articles with questions as titles and pairing them with summaries of varying length. This dataset is used to learn a QA pair generation model producing summaries as answers that balance brevity with sufficiency jointly with their corresponding questions. We then reinforce the QA pair generation process with a differentiable reward function to mitigate exposure bias, a common problem in natural language generation. Both automatic metrics and human evaluation demonstrate these QA pairs successfully capture the central gists of the articles and achieve high answer accuracy.

Code Implementations: 1 repo
