CL | AI | Aug 16, 2023

Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions

arXiv:2308.08661v1 | 6 citations | h-index: 119
Originality: Incremental advance
AI Analysis

This work addresses the challenge of handling ambiguous questions in open-domain QA. It is an incremental advance: it builds on existing retrieval methods and adds a novel database approach.

The paper tackles the problem of answering ambiguous open-domain questions by using a database of unambiguous questions generated from Wikipedia, achieving a 15% relative improvement on recall measures and a 10% improvement on disambiguation measures on the ASQA benchmark.

Many open-domain questions are under-specified and thus have multiple possible answers, each of which is correct under a different interpretation of the question. Answering such ambiguous questions is challenging, as it requires retrieving and then reasoning about diverse information from multiple passages. We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia. On the challenging ASQA benchmark, which requires generating long-form answers that summarize the multiple answers to an ambiguous question, our method improves performance by 15% (relative improvement) on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs. Retrieving from the database of generated questions also gives large improvements in diverse passage retrieval (by matching user questions q to passages p indirectly, via questions q' generated from p).
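The indirect matching described at the end of the abstract (user question q to generated question q' to source passage p) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `GeneratedQuestion`, `similarity`, `retrieve_passages`, and `QUESTION_DB` are hypothetical, the toy database entries are invented, and token-overlap (Jaccard) similarity stands in for whatever retriever the paper actually uses, which the abstract does not specify.

```python
import re
from collections import namedtuple

# Hypothetical record: a question q' generated from passage p (stored by id).
GeneratedQuestion = namedtuple("GeneratedQuestion", ["text", "passage_id"])


def tokens(text):
    """Lowercased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard token overlap; a stand-in for a learned retriever (assumption)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve_passages(query, question_db, k=2):
    """Score q against each generated q', then return the top-k distinct passages."""
    best = {}  # passage_id -> best score over that passage's generated questions
    for gq in question_db:
        score = similarity(query, gq.text)
        if score > best.get(gq.passage_id, 0.0):
            best[gq.passage_id] = score
    # Sort by descending score, breaking ties by passage id for determinism.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pid for pid, _ in ranked[:k]]


# Toy database, invented for illustration only.
QUESTION_DB = [
    GeneratedQuestion("Who played the Joker in the 1989 film Batman?", "p1"),
    GeneratedQuestion("Who played the Joker in the 2008 film The Dark Knight?", "p2"),
    GeneratedQuestion("Who directed the 1989 film Batman?", "p1"),
    GeneratedQuestion("What is the capital of France?", "p3"),
]

if __name__ == "__main__":
    print(retrieve_passages("Who played the Joker?", QUESTION_DB, k=2))
```

Because scores are aggregated per passage, an ambiguous question such as "Who played the Joker?" surfaces one passage per interpretation rather than several hits from the same passage, which is the kind of diversity the abstract attributes to retrieving from the question database.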

