CL, AI | Sep 16, 2020

Tag and Correct: Question aware Open Information Extraction with Two-stage Decoding

Martin Kuo, Yaobo Liang, Lei Ji, Nan Duan, Linjun Shou, Ming Gong, Peng Chen

arXiv:2009.07406v10.2

Originality: Incremental advance

AI Analysis

This work addresses the challenge of generating readable and falsifiable answer tuples from passages for question answering, offering an incremental improvement over existing generative approaches.

The paper tackles the problem of Question Aware Open Information Extraction by proposing a two-stage decoding model that tags keywords from a passage and then generates answers based on those tags, achieving a BLEU score of 59.32 on the WebAssertions dataset, outperforming previous generative methods.

Question Aware Open Information Extraction (Question aware Open IE) takes a question and a passage as input and outputs an answer tuple containing a subject, a predicate, and one or more arguments. Each field of the answer is a natural language word sequence extracted from the passage. Compared to a span answer, this semi-structured answer has two advantages: it is more readable and more falsifiable. There are two existing approaches to this problem. The extractive method extracts candidate answers from the passage with an Open IE model and ranks them by matching them against the question. It makes full use of the passage information at the extraction step, but the extraction is independent of the question. The generative method uses a sequence-to-sequence model to generate answers directly. It takes the question and passage together as input, but it generates the answer from scratch and so does not exploit the fact that most answer words come from the passage. To guide generation with the passage, we present a two-stage decoding model consisting of a tagging decoder and a correction decoder. In the first stage, the tagging decoder tags keywords in the passage. In the second stage, the correction decoder generates the answer based on the tagged keywords. Although it has two stages, our model can be trained end-to-end. Compared to previous generative models, we generate better answers by decoding from coarse to fine. We evaluate our model on WebAssertions (Yan et al., 2018), a Question aware Open IE dataset. Our model achieves a BLEU score of 59.32, outperforming previous generative methods.
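The tag-then-correct data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' model: both decoders in the paper are learned neural components, whereas here the tagger is a hand-written keyword set and the corrector is a fixed rewrite table. The function names (`tag_keywords`, `coarse_draft`, `correct`), the example passage, and the rewrite rule are all hypothetical and chosen only to show how a coarse draft built from tagged passage tokens might be refined into a final answer.

```python
from typing import Callable, Dict, List


def tag_keywords(passage: List[str], is_keyword: Callable[[str], bool]) -> List[int]:
    """Stage 1 stand-in: emit one binary tag per passage token (1 = keep)."""
    return [1 if is_keyword(tok) else 0 for tok in passage]


def coarse_draft(passage: List[str], tags: List[int]) -> List[str]:
    """Build the coarse draft by keeping only the tagged tokens, in order."""
    return [tok for tok, tag in zip(passage, tags) if tag == 1]


def correct(draft: List[str], rewrite: Dict[str, str]) -> List[str]:
    """Stage 2 stand-in: refine the draft with a fixed rewrite table.

    A learned correction decoder would generate the answer conditioned on
    the tagged keywords; a lookup table is used here purely for illustration.
    """
    return [rewrite.get(tok, tok) for tok in draft]


# Hypothetical example passage, already tokenized.
passage = ["The", "Eiffel", "Tower", ",", "located", "in", "Paris", ",",
           "attracts", "visitors"]

# Hypothetical keyword set standing in for the tagging decoder's predictions.
keywords = {"Eiffel", "Tower", "located", "in", "Paris"}

tags = tag_keywords(passage, lambda tok: tok in keywords)
draft = coarse_draft(passage, tags)

# Hypothetical correction: insert a word that was not tagged in the passage.
final = correct(draft, {"located": "is located"})

print(" ".join(draft))  # coarse answer made only of passage tokens
print(" ".join(final))  # refined answer after the correction step
```

In this sketch the draft contains only passage words, which mirrors the observation that most answer words come from the passage, while the correction step is the only place where new wording can be introduced.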
