Question Answering from Unstructured Text by Retrieval and Comprehension
This work addresses the problem of extracting answers from unstructured sources such as Wikipedia, which are harder for QA systems to use than structured knowledge bases.
The paper tackles open-domain question answering from unstructured text by proposing a two-step retrieval and comprehension approach, achieving a 40% error reduction on the WIKIMOVIES dataset.
Open-domain Question Answering (QA) systems must interact with external knowledge sources, such as web pages, to find relevant information. Information sources like Wikipedia, however, are not well structured and are difficult to use in comparison with Knowledge Bases (KBs). In this work we present a two-step approach to question answering from unstructured text, consisting of a retrieval step and a comprehension step. For comprehension, we present an RNN-based attention model with a novel mixture mechanism for selecting answers from either retrieved articles or a fixed vocabulary. For retrieval, we introduce a hand-crafted model and a neural model for ranking relevant articles. We achieve state-of-the-art performance on the WIKIMOVIES dataset, reducing the error by 40%. Our experimental results further demonstrate the importance of each of the introduced components.
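The idea of choosing an answer from either the retrieved text or a fixed vocabulary can be illustrated with a minimal sketch. It is not the model described in this paper: it assumes a scalar gate that blends an attention distribution over article tokens with a softmax over vocabulary words, and every name in it (`mixture_answer_distribution`, `gate_logit`, the toy scores) is hypothetical. In a trained model the scores and the gate would come from the RNN; here they are hard-coded.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_answer_distribution(doc_tokens, attn_scores, vocab, vocab_scores, gate_logit):
    """Blend a distribution over article tokens with one over a fixed vocabulary.

    Assumption: a single sigmoid gate weights the two sources. The gate and
    all scores are placeholders for values a trained network would produce.
    """
    gate = 1.0 / (1.0 + math.exp(-gate_logit))  # weight given to the article
    attn = softmax(attn_scores)                 # distribution over article tokens
    vocab_probs = softmax(vocab_scores)         # distribution over vocabulary words

    probs = {}
    # Probability mass from the article, summed over repeated tokens.
    for token, p in zip(doc_tokens, attn):
        probs[token] = probs.get(token, 0.0) + gate * p
    # Probability mass from the fixed vocabulary.
    for word, p in zip(vocab, vocab_probs):
        probs[word] = probs.get(word, 0.0) + (1.0 - gate) * p
    return probs


# Toy inputs, invented for illustration only.
doc_tokens = ["blade", "runner", "ridley", "scott"]
attn_scores = [0.1, 0.2, 2.0, 2.5]
vocab = ["yes", "no", "scott"]
vocab_scores = [0.0, 0.0, 1.0]

probs = mixture_answer_distribution(doc_tokens, attn_scores, vocab, vocab_scores, gate_logit=0.0)
best_answer = max(probs, key=probs.get)
```

Because the two component distributions each sum to one and the gate weights sum to one, the blended result is itself a valid distribution, and a word that appears in both sources accumulates mass from both.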