CL · Mar 31, 2017

Reading Wikipedia to Answer Open-Domain Questions

arXiv:1704.00051v2 · 2214 citations
Originality: Incremental advance
AI Analysis

The paper addresses the problem of answering factoid questions from large-scale text sources, for users who need quick access to information. It is an incremental advance that integrates existing techniques into a complete system.

The paper tackles open-domain question answering using Wikipedia as the sole knowledge source. It combines document retrieval with machine comprehension to find answer spans in articles, and shows that the resulting system is highly competitive on multiple QA datasets.

This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. This task of machine reading at scale combines the challenges of document retrieval (finding the relevant articles) with that of machine comprehension of text (identifying the answer spans from those articles). Our approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA datasets indicate that (1) both modules are highly competitive with respect to existing counterparts and (2) multitask learning using distant supervision on their combination is an effective complete system on this challenging task.
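
To make the retrieval step concrete, here is a minimal sketch of TF-IDF matching over hashed unigrams and bigrams. It is not the authors' implementation: the abstract only names "bigram hashing and TF-IDF matching", so the tokenizer, the CRC32 hash, the bucket count `NUM_BUCKETS`, the log-scaled weighting, and the function names `hashed_ngrams`, `build_index`, and `retrieve` are all assumptions chosen for illustration.

```python
import math
import re
import zlib
from collections import Counter

NUM_BUCKETS = 2 ** 20  # assumed hash-space size, chosen for illustration


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def hashed_ngrams(text):
    """Count unigrams and bigrams after hashing each one to a bucket id."""
    tokens = tokenize(text)
    grams = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(zlib.crc32(g.encode("utf-8")) % NUM_BUCKETS for g in grams)


def build_index(docs):
    """Return sparse TF-IDF vectors (one dict per document) and the IDF table."""
    counts = [hashed_ngrams(d) for d in docs]
    doc_freq = Counter(bucket for c in counts for bucket in c)
    n = len(docs)
    # Smoothed IDF (an assumed variant); always positive.
    idf = {b: math.log((n + 1) / (f + 1)) + 1.0 for b, f in doc_freq.items()}
    vectors = [{b: math.log1p(tf) * idf[b] for b, tf in c.items()} for c in counts]
    return vectors, idf


def retrieve(question, vectors, idf, k=1):
    """Rank documents by the dot product of question and document vectors."""
    q_counts = hashed_ngrams(question)
    q_vec = {b: math.log1p(tf) * idf[b] for b, tf in q_counts.items() if b in idf}
    scores = [sum(w * vec.get(b, 0.0) for b, w in q_vec.items()) for vec in vectors]
    ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy corpus standing in for Wikipedia articles (hypothetical data).
docs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "The Nile is a river in Africa.",
]
vectors, idf = build_index(docs)
print(retrieve("What is the capital of France?", vectors, idf, k=2))
```

Hashing the n-grams into a fixed number of buckets keeps memory bounded no matter how many distinct bigrams the corpus contains, at the cost of occasional collisions. In a full pipeline, the top-ranked articles would then be passed to the reading-comprehension model that extracts the answer span.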

Code Implementations · 10 repos

Data from Papers with Code (CC-BY-SA-4.0)

