Entity and Evidence Guided Relation Extraction for DocRED
This work addresses relation extraction from multi-sentence documents, a task that matters for natural language processing applications, though the contribution appears incremental because it builds on existing pre-trained language models.
The paper tackles document-level relation extraction by proposing a joint training framework (E2GRE) that uses entity-guided sequences and attention probabilities to improve focus on relevant entities and evidence, achieving state-of-the-art results on the DocRED dataset.
Document-level relation extraction is a challenging task which requires reasoning over multiple sentences in order to predict relations in a document. In this paper, we propose a joint training framework E2GRE (Entity and Evidence Guided Relation Extraction) for this task. First, we introduce entity-guided sequences as inputs to a pre-trained language model (e.g., BERT, RoBERTa). These entity-guided sequences help a pre-trained language model (LM) to focus on areas of the document related to the entity. Secondly, we guide the fine-tuning of the pre-trained language model by using its internal attention probabilities as additional features for evidence prediction. Our new approach encourages the pre-trained language model to focus on the entities and supporting/evidence sentences. We evaluate our E2GRE approach on DocRED, a recently released large-scale dataset for relation extraction. Our approach is able to achieve state-of-the-art results on the public leaderboard across all metrics, showing that our E2GRE is both effective and synergistic on relation extraction and evidence prediction.
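The two ideas in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact input layout or how attention is pooled, so everything below is an assumption made for illustration: the layout `[CLS] entity [SEP] document [SEP]`, the mean pooling of attention over each sentence, and the names `build_entity_guided_sequences` and `sentence_attention_features` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

CLS, SEP = "[CLS]", "[SEP]"


def build_entity_guided_sequences(
    doc_tokens: List[str],
    entity_mentions: Dict[str, List[str]],
    max_len: int = 512,
) -> Dict[str, List[str]]:
    """Build one input sequence per entity (assumed layout, not the paper's).

    Each sequence prepends the entity's mention tokens to the document so the
    language model sees which entity the sequence is about.
    """
    sequences = {}
    for entity_id, mention_tokens in entity_mentions.items():
        prefix = [CLS] + mention_tokens + [SEP]
        # Reserve one slot for the closing [SEP]; truncate the document to fit.
        budget = max(max_len - len(prefix) - 1, 0)
        sequences[entity_id] = prefix + doc_tokens[:budget] + [SEP]
    return sequences


def sentence_attention_features(
    token_attention: List[float],
    sentence_spans: List[Tuple[int, int]],
) -> List[float]:
    """Average per-token attention inside each sentence span [start, end).

    The result is one score per sentence, which could serve as an extra
    feature for an evidence-sentence classifier (assumed pooling choice).
    """
    features = []
    for start, end in sentence_spans:
        span = token_attention[start:end]
        features.append(sum(span) / len(span) if span else 0.0)
    return features


if __name__ == "__main__":
    doc = ["Alice", "works", "at", "Acme", "."]
    seqs = build_entity_guided_sequences(doc, {"e1": ["Alice"], "e2": ["Acme"]})
    print(seqs["e1"])
    print(sentence_attention_features([0.25, 0.75, 0.5, 0.5], [(0, 2), (2, 4)]))
```

In a real pipeline the token lists would come from the language model's own tokenizer and the attention values from the model's attention outputs. Plain Python lists are used here only to keep the sketch self-contained.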