CL IR Jun 15, 2020

On the Multi-Property Extraction and Beyond

Tomasz Dwojak, Michał Pietruszka, Łukasz Borchmann, Filip Graliński, Jakub Chłędowski

arXiv:2006.08281v1

Originality: Incremental advance

AI Analysis

This work targets information extraction and machine reading comprehension and is aimed at researchers and practitioners in natural language processing. Its contribution appears incremental, centered on dataset refinement and model adaptation.

The paper tackles multi-property extraction from text with a Dual-source Transformer architecture that outperforms the previous state of the art on the WikiReading dataset by a large margin. It also introduces WikiReading Recycled, a new public dataset that supports multi-property extraction while avoiding the identified shortcomings of the original.

In this paper, we investigate the Dual-source Transformer architecture on the WikiReading information extraction and machine reading comprehension dataset. The proposed model outperforms the current state-of-the-art by a large margin. Next, we introduce WikiReading Recycled, a newly developed public dataset supporting the task of multiple property extraction. It keeps the spirit of the original WikiReading but does not inherit the identified disadvantages of its predecessor.
