Graphene: Semantically-Linked Propositions in Open Information Extraction
This work addresses the challenge of handling complex linguistic structures in Open IE, a problem relevant to many natural language processing applications. The contribution appears incremental, however: it builds on existing Open IE approaches and adds a novel transformation method.
The paper tackles the problem of extracting structured information from complex sentences in Open Information Extraction. It introduces a two-layered transformation stage that combines clausal and phrasal disembedding with rhetorical relation identification. The resulting system, Graphene, outperforms state-of-the-art Open IE systems in constructing correct n-ary predicate-argument structures.
We present an Open Information Extraction (IE) approach that uses a two-layered transformation stage consisting of a clausal disembedding layer and a phrasal disembedding layer, together with rhetorical relation identification. In this way, we convert sentences with a complex linguistic structure into simplified, syntactically sound sentences. From these we extract propositions that are represented in a two-layered hierarchy of core relational tuples and accompanying contextual information, semantically linked via rhetorical relations. In a comparative evaluation, we demonstrate that our reference implementation Graphene outperforms state-of-the-art Open IE systems in the construction of correct n-ary predicate-argument structures. Moreover, we show that existing Open IE approaches can benefit from the transformation process of our framework.
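To make the two-layered output more concrete, the following is a minimal Python sketch of how core tuples and their contextual propositions might be stored and linked. It is not Graphene's actual data model or API: the class `Proposition`, its fields, the helper functions, the example sentence, and the relation label `CONTRAST` are all assumptions introduced here for illustration, and the propositions are built by hand rather than extracted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Proposition:
    """One relational tuple. All names here are hypothetical, not Graphene's API."""
    pid: str
    subject: str
    predicate: str
    obj: str
    context_layer: int = 0  # assumed encoding: 0 = core tuple, 1 = contextual information
    # Outgoing rhetorical links as (relation label, target proposition id).
    links: List[Tuple[str, str]] = field(default_factory=list)


def core_propositions(props: List[Proposition]) -> List[Proposition]:
    """Return only the core relational tuples."""
    return [p for p in props if p.context_layer == 0]


def contexts_of(prop: Proposition, props: List[Proposition]) -> List[Tuple[str, Proposition]]:
    """Resolve a proposition's links to (relation label, linked proposition) pairs."""
    by_id = {p.pid: p for p in props}
    return [(rel, by_id[target]) for rel, target in prop.links if target in by_id]


# Hand-built propositions for the made-up sentence
# "Although the system is simple, it outperforms its baselines."
props = [
    Proposition("p1", "the system", "outperforms", "its baselines",
                context_layer=0, links=[("CONTRAST", "p2")]),
    Proposition("p2", "the system", "is", "simple", context_layer=1),
]

for core in core_propositions(props):
    print(core.subject, "|", core.predicate, "|", core.obj)
    for relation, ctx in contexts_of(core, props):
        print("   ", relation, "->", ctx.subject, "|", ctx.predicate, "|", ctx.obj)
```

Keeping the links as labelled references between propositions, rather than nesting one tuple inside another, is one way to preserve the semantic connection between the simplified sentences while leaving each tuple small enough to consume on its own.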