CLAIIR
May 14, 2021

QAConv: Question Answering on Informative Conversations

arXiv:2105.06912v2643 citations
Has Code
Originality: Synthesis-oriented
AI Analysis

QAConv provides a new training and evaluation testbed for QA on conversations, addressing a gap in handling informative dialogues such as business emails and panel discussions.

The authors tackle question answering on long, complex informative conversations by introducing QAConv, a dataset of 34,608 QA pairs from 10,259 conversations. They find that state-of-the-art pretrained QA systems have limited zero-shot performance and often predict the questions as unanswerable.

This paper introduces QAConv, a new question answering (QA) dataset that uses conversations as a knowledge source. We focus on informative conversations, including business emails, panel discussions, and work channels. Unlike open-domain and task-oriented dialogues, these conversations are usually long, complex, asynchronous, and involve strong domain knowledge. In total, we collect 34,608 QA pairs from 10,259 selected conversations with both human-written and machine-generated questions. We use a question generator and a dialogue summarizer as auxiliary tools to collect and recommend questions. The dataset has two testing scenarios: chunk mode and full mode, depending on whether the grounded partial conversation is provided or retrieved. Experimental results show that state-of-the-art pretrained QA systems have limited zero-shot performance and tend to predict our questions as unanswerable. Our dataset provides a new training and evaluation testbed to facilitate QA on conversations research.
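The difference between the two testing scenarios can be illustrated with a minimal sketch. In chunk mode the grounded partial conversation is handed to the QA system directly; in full mode it has to be retrieved from the whole conversation first. Everything below is an assumption made for illustration: the function names, the fixed-size chunking rule, and the token-overlap retriever are hypothetical stand-ins, not the dataset's format or the retriever used in the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_into_chunks(turns, chunk_size):
    """Group consecutive turns into fixed-size chunks (hypothetical chunking rule)."""
    return [turns[i:i + chunk_size] for i in range(0, len(turns), chunk_size)]


def overlap_score(question, chunk):
    """Toy relevance score: number of question tokens that also occur in the chunk."""
    q = Counter(tokenize(question))
    c = Counter(tokenize(" ".join(chunk)))
    return sum(min(q[t], c[t]) for t in q)


def retrieve_chunk(question, chunks):
    """Pick the chunk with the highest overlap score (stand-in for a real retriever)."""
    return max(chunks, key=lambda chunk: overlap_score(question, chunk))


def select_context(question, conversation, gold_chunk=None, chunk_size=2):
    """Return the context a QA model would read.

    Chunk mode: the grounded chunk is provided, so it is used as-is.
    Full mode: no chunk is provided, so one is retrieved from the conversation.
    """
    if gold_chunk is not None:
        return gold_chunk
    return retrieve_chunk(question, split_into_chunks(conversation, chunk_size))


# Made-up conversation, not taken from the dataset.
conversation = [
    "Alice: The quarterly budget review is moved to Thursday.",
    "Bob: Thanks, I will update the calendar.",
    "Carol: Who is presenting the sales forecast?",
    "Dave: Erin is presenting the sales forecast.",
]
question = "Who is presenting the sales forecast?"

chunk_mode_context = select_context(question, conversation, gold_chunk=conversation[2:4])
full_mode_context = select_context(question, conversation)
```

In this sketch full mode is strictly harder than chunk mode, because a retrieval mistake leaves the QA model reading a context that does not contain the answer.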

Code Implementations: 2 repos
