CL · Apr 20, 2018

Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension

Minjoon Seo, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi

arXiv:1804.07726v232.31122 citations · Has Code

Originality: Incremental advance

AI Analysis

This work proposes a new benchmark intended to push the QA research community toward scalable document comprehension. The contribution is incremental, since it builds on existing QA frameworks.

The paper introduces Phrase-Indexed Question Answering (PIQA), a modular QA task that enforces independence between the document encoder and the question encoder to address scalability challenges in machine comprehension. Its baseline models achieve reasonable accuracy but underperform unconstrained QA models.

We formalize a new modular variant of current question answering tasks by enforcing complete independence of the document encoder from the question encoder. This formulation addresses a key challenge in machine comprehension by requiring a standalone representation of the document discourse. It additionally leads to a significant scalability advantage since the encoding of the answer candidate phrases in the document can be pre-computed and indexed offline for efficient retrieval. We experiment with baseline models for the new task, which achieve a reasonable accuracy but significantly underperform unconstrained QA models. We invite the QA research community to engage in Phrase-Indexed Question Answering (PIQA, pika) for closing the gap. The leaderboard is at: nlp.cs.washington.edu/piqa
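The independence constraint described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's baseline: the function names (`encode_text`, `build_phrase_index`, `answer`) are hypothetical, a bag-of-words count vector stands in for a learned encoder, and inner-product scoring is used as one simple way to compare the two encodings. The point it shows is structural: the phrase index is built once without seeing any question, and each question is encoded separately and matched against that index.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Sparse bag-of-words vector: a toy stand-in for a learned encoding (assumption).
Vector = Dict[str, float]
PhraseIndex = List[Tuple[str, Vector]]


def encode_text(text: str) -> Vector:
    """Toy encoder: lowercase token counts. A real system would use a neural encoder."""
    return dict(Counter(text.lower().split()))


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two sparse vectors."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def build_phrase_index(document: str, max_len: int = 3, window: int = 2) -> PhraseIndex:
    """Offline step: enumerate candidate phrases and encode each one.

    No question is available here, so the document side is question-independent.
    Each phrase is encoded together with a small context window around it.
    """
    tokens = document.split()
    index: PhraseIndex = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            phrase = " ".join(tokens[start:end])
            context = " ".join(tokens[max(0, start - window):end + window])
            index.append((phrase, encode_text(context)))
    return index


def answer(question: str, index: PhraseIndex) -> str:
    """Online step: encode the question alone, then return the best-scoring phrase."""
    question_vector = encode_text(question)
    return max(index, key=lambda entry: dot(question_vector, entry[1]))[0]


if __name__ == "__main__":
    # The index is built once and reused for every question.
    index = build_phrase_index("Paris is the capital of France")
    for question in ["what is the capital of France", "where is Paris"]:
        print(question, "->", answer(question, index))
```

With a toy encoder like this the returned phrases are not meaningful answers. The sketch only demonstrates why pre-computing the document side makes retrieval cheap: answering a new question requires one question encoding plus a search over stored vectors, with no re-reading of the document.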
