IR · CL · Apr 25, 2022

Conversational Question Answering on Heterogeneous Sources

arXiv:2204.11677v2 · 52 citations · h-index: 96
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of implicit context in follow-up questions, for users who need answers drawn from diverse data sources. It is an incremental advance that integrates multiple source types rather than a paradigm shift.

The paper tackles the problem of conversational question answering over heterogeneous sources (knowledge bases, text, and tables) by introducing CONVINSE, an end-to-end pipeline that boosts answer coverage and confidence, as demonstrated on the new ConvMix benchmark with 3000 conversations and 16000 questions.

Conversational question answering (ConvQA) tackles sequential information needs where contexts in follow-up questions are left implicit. Current ConvQA systems operate over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This paper addresses the novel issue of jointly tapping into all of these together, this way boosting answer coverage and confidence. We present CONVINSE, an end-to-end pipeline for ConvQA over heterogeneous sources, operating in three stages: i) learning an explicit structured representation of an incoming question and its conversational context, ii) harnessing this frame-like representation to uniformly capture relevant evidences from KB, text, and tables, and iii) running a fusion-in-decoder model to generate the answer. We construct and release the first benchmark, ConvMix, for ConvQA over heterogeneous sources, comprising 3000 real-user conversations with 16000 questions, along with entity annotations, completed question utterances, and question paraphrases. Experiments demonstrate the viability and advantages of our method, compared to state-of-the-art baselines.
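
The three stages described in the abstract can be illustrated with a minimal Python sketch. This is not the CONVINSE implementation: the `Frame` slots, the verbalization helpers, the token-overlap scoring, and the `question: ... context: ...` input format are all assumptions chosen for illustration, and the final generation step (a trained fusion-in-decoder model) is only represented by the inputs it would receive.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical frame-like representation of a question in its context."""
    context_entity: str
    question_entity: str
    relation: str
    answer_type: str

    def as_text(self) -> str:
        return " ".join([self.context_entity, self.question_entity,
                         self.relation, self.answer_type])


@dataclass
class Evidence:
    text: str
    source: str  # "kb", "text", or "table"


def _tokens(text: str) -> set[str]:
    # Crude tokenizer for the sketch: lowercase, drop commas and periods.
    return set(text.lower().replace(",", " ").replace(".", " ").split())


def verbalize_kb(fact: tuple[str, str, str]) -> Evidence:
    """Turn a (subject, predicate, object) triple into a plain-text evidence."""
    return Evidence(" ".join(fact), "kb")


def verbalize_table_row(header: list[str], row: list[str]) -> Evidence:
    """Turn one table row into a plain-text evidence."""
    return Evidence(", ".join(f"{h} is {v}" for h, v in zip(header, row)), "table")


def retrieve(frame: Frame, pool: list[Evidence], k: int = 3) -> list[Evidence]:
    """Rank evidences from all sources uniformly by token overlap with the frame."""
    query = _tokens(frame.as_text())
    return sorted(pool, key=lambda e: len(query & _tokens(e.text)), reverse=True)[:k]


def fid_inputs(frame: Frame, evidences: list[Evidence]) -> list[str]:
    """Build one encoder input per evidence; a fusion-in-decoder model would
    encode these separately and let the decoder attend over all of them."""
    return [f"question: {frame.as_text()} context: {e.text}" for e in evidences]
```

Because KB facts and table rows are verbalized into text before retrieval, a single scoring function can rank evidence from all three source types, which is the point of the uniform representation in stage ii.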
