CL, Jul 24, 2017

Analysing Errors of Open Information Extraction Systems

arXiv:1707.07499v11101 citations
Originality: Synthesis-oriented
AI Analysis

This work provides a comprehensive error analysis of OIE systems. Its contribution is incremental: it builds on existing benchmarks and tools to guide improvements in information extraction.

The authors benchmarked four Open Information Extraction (OIE) systems on four data sets totaling 4522 sentences and 11243 relations. They also analyzed the impact of five error classes on a subset of 749 n-ary tuples to identify key research directions for future OIE systems.

We report results on benchmarking Open Information Extraction (OIE) systems using RelVis, a toolkit for benchmarking such systems. Our comprehensive benchmark contains three data sets from the news domain and one data set from Wikipedia, with a total of 4522 labeled sentences and 11243 binary or n-ary OIE relations. In our analysis of these data sets, we compared the performance of four popular OIE systems: ClausIE, OpenIE 4.2, Stanford OpenIE, and PredPatt. In addition, we evaluated the impact of five common error classes on a subset of 749 n-ary tuples. From this in-depth analysis, we identify important research directions for the next generation of OIE systems.

