CL | Sep 26, 2019

Fine-tune Bert for DocRED with Two-step Process

arXiv:1909.11898v1 | 131 citations
Originality: Synthesis-oriented
AI Analysis

This work provides an incremental improvement for researchers in natural language processing by enhancing baselines for document-level relation extraction.

The authors tackle document-level relation extraction on the DocRED dataset by fine-tuning a pre-trained BERT model and splitting prediction into two steps. The result outperforms the existing BiLSTM baselines and serves as a stronger baseline for the task.

Modelling relations between multiple entities has attracted increasing attention recently, and a new dataset called DocRED has been collected to accelerate research on document-level relation extraction. Current baselines for this task use a BiLSTM to encode the whole document and are trained from scratch. We argue that such simple baselines are not strong enough to model the complex interactions between entities. In this paper, we apply a pre-trained language model (BERT) to provide a stronger baseline for this task. We also find that solving the task in phases further improves performance: the first step predicts whether or not two entities have a relation, and the second step predicts the specific relation.
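The two-step inference described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `two_step_predict`, the probability dictionaries, and both thresholds are hypothetical, and the scores are assumed to come from classifiers (for example, heads on top of BERT) that are not shown here.

```python
from typing import Dict, List, Tuple

# An entity pair is identified by (head entity, tail entity); names are illustrative.
Pair = Tuple[str, str]


def two_step_predict(
    has_relation_prob: Dict[Pair, float],
    relation_probs: Dict[Pair, Dict[str, float]],
    pair_threshold: float = 0.5,      # assumed cut-off for step 1
    relation_threshold: float = 0.5,  # assumed cut-off for step 2
) -> Dict[Pair, List[str]]:
    """Combine a binary 'any relation?' score with per-relation scores."""
    predictions: Dict[Pair, List[str]] = {}
    for pair, p_any in has_relation_prob.items():
        # Step 1: skip pairs that the binary classifier judges unrelated.
        if p_any < pair_threshold:
            continue
        # Step 2: for the remaining pairs, keep each relation type whose
        # score clears the threshold (a pair may carry several relations).
        labels = sorted(
            rel
            for rel, p in relation_probs.get(pair, {}).items()
            if p >= relation_threshold
        )
        if labels:
            predictions[pair] = labels
    return predictions


if __name__ == "__main__":
    # Made-up scores for two entity pairs, for illustration only.
    has_rel = {("Paris", "France"): 0.9, ("Paris", "Berlin"): 0.2}
    rel = {
        ("Paris", "France"): {"capital of": 0.8, "country": 0.3},
        ("Paris", "Berlin"): {"country": 0.99},
    }
    print(two_step_predict(has_rel, rel))
```

In this sketch the second pair is dropped in step 1 even though one of its relation scores is high, which is the intended effect of the phased design: the relation classifier is only consulted for pairs already judged to be related.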

Code Implementations: 1 repo