CL Nov 10, 2022

Not Just Plain Text! Fuel Document-Level Relation Extraction with Explicit Syntax Refinement and Subsentence Modeling

Zhichao Duan, Xiuxing Li, Zhenyu Li, Zhuo Wang, Jianyong Wang

arXiv:2211.05343v224.0292 citations h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses document-level relation extraction, a key task in natural language processing for applications like information retrieval, but it appears incremental as it builds on existing methods by incorporating syntactic information.

The paper tackles the challenge of identifying semantic relationships between entities in long documents by proposing LARSON, a framework that uses explicit syntax refinement and subsentence modeling to focus on informative text portions, achieving significant performance improvements on benchmark datasets like DocRED, CDR, and GDA.

Document-level relation extraction (DocRE) aims to identify semantic labels among entities within a single document. One major challenge of DocRE is to dig out the decisive details about a specific entity pair from long text. However, in many cases, only a fraction of the text carries the required information, even within the manually labeled supporting evidence. To better capture and exploit instructive information, we propose a novel expLicit syntAx Refinement and Subsentence mOdeliNg based framework (LARSON). By introducing extra syntactic information, LARSON can model subsentences of arbitrary granularity and efficiently screen instructive ones. Moreover, we incorporate refined syntax into text representations, which further improves the performance of LARSON. Experimental results on three benchmark datasets (DocRED, CDR, and GDA) demonstrate that LARSON significantly outperforms existing methods.
