CL · Jun 2, 2020

Open-Domain Question Answering with Pre-Constructed Question Spaces

Jinfeng Xiao, Lidan Wang, Franck Dernoncourt, Trung Bui, Tong Sun, Jiawei Han

arXiv:2006.08337v227.9730 citations

Originality: Incremental advance

AI Analysis

This work addresses bottlenecks in existing QA systems for users who need accurate answers from large document collections, though it appears incremental because it builds on prior families of solutions.

The paper tackles open-domain question answering by proposing a novel reader-retriever algorithm that pre-constructs question spaces offline and combines results with retriever-reader methods, achieving superior accuracy on real-world datasets.

Open-domain question answering is the task of locating answers to user-generated questions in massive collections of documents. Two families of solutions are available: retriever-readers and knowledge-graph-based approaches. A retriever-reader usually first uses an information retrieval method such as TF-IDF to locate documents or paragraphs that are likely to be relevant to the question, and then feeds the retrieved text to a neural network reader to extract the answer. Alternatively, knowledge graphs can be constructed from the corpus and queried to answer user questions. We propose a novel algorithm with a reader-retriever structure that differs from both families. Our reader-retriever first uses an offline reader to read the corpus and generate collections of all answerable questions together with their answers, and then uses an online retriever to respond to user queries by searching the pre-constructed question spaces for answers that are most likely to be asked in the given way. We further combine the retriever-reader and reader-retriever results into a single answer by examining the consistency between the two components. We claim that our algorithm solves some bottlenecks in existing work, and demonstrate that it achieves superior accuracy on real-world datasets.
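The reader-retriever flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `QuestionSpace` class, the `combine` function, the TF-IDF cosine scoring over questions, and the `threshold` rule are all hypothetical stand-ins chosen for illustration. The sketch assumes the offline reader has already produced (question, answer) pairs and that a separate retriever-reader supplies a baseline answer.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; question marks are treated as spaces.
    return text.lower().replace("?", " ").split()


class QuestionSpace:
    """Hypothetical pre-constructed question space: (question, answer) pairs
    assumed to come from an offline reader pass over the corpus."""

    def __init__(self, qa_pairs):
        self.qa_pairs = list(qa_pairs)
        if not self.qa_pairs:
            raise ValueError("question space must contain at least one pair")
        docs = [tokenize(q) for q, _ in self.qa_pairs]
        n = len(docs)
        # Document frequency of each token across the stored questions.
        df = Counter(tok for doc in docs for tok in set(doc))
        # Smoothed inverse document frequency.
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vectors = [self._vectorize(doc) for doc in docs]

    def _vectorize(self, tokens):
        # Sparse TF-IDF vector; tokens unseen in the question space are dropped.
        tf = Counter(tokens)
        return {t: c * self.idf[t] for t, c in tf.items() if t in self.idf}

    @staticmethod
    def _cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def retrieve(self, query):
        # Online step: return the answer of the most similar stored question.
        qv = self._vectorize(tokenize(query))
        scores = [self._cosine(qv, v) for v in self.vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.qa_pairs[best][1], scores[best]


def combine(rr_answer, rr_score, baseline_answer, threshold=0.5):
    """Assumed consistency rule (not from the paper): keep an answer both
    components agree on; otherwise trust the question-space match only when
    its similarity score reaches the threshold."""
    if rr_answer == baseline_answer:
        return rr_answer
    return rr_answer if rr_score >= threshold else baseline_answer


pairs = [
    ("Who wrote Hamlet?", "William Shakespeare"),
    ("What is the capital of France?", "Paris"),
    ("When did World War II end?", "1945"),
]
space = QuestionSpace(pairs)
answer, score = space.retrieve("capital city of France?")
final = combine(answer, score, baseline_answer="Paris")
```

In this toy setup the expensive reading happens once, when the pairs are generated, and the online step reduces to a similarity search over questions rather than over passages. A real system would likely replace the TF-IDF matcher with a learned retriever.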
