Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction
This work addresses a specific problem in NLP for researchers and practitioners working on information extraction. It offers an incremental improvement: the decoding process is modified to reduce exposure bias.
The paper tackles exposure bias in Seq2Seq models for joint entity and relation extraction, which causes overfitting to frequent label combinations and reduces generalization. The proposed Seq2UMTree model limits the decoding length and removes the order among triplets, and it shows significantly better generalization on the DuIE and NYT datasets than state-of-the-art Seq2Seq models.
Joint entity and relation extraction aims to extract relation triplets from plain text directly. Prior work leverages Sequence-to-Sequence (Seq2Seq) models for triplet sequence generation. However, Seq2Seq enforces an unnecessary order on the unordered triplets and involves a large decoding length associated with error accumulation. These introduce exposure bias, which may cause the models to overfit to frequent label combinations, thus deteriorating generalization. We propose a novel Sequence-to-Unordered-Multi-Tree (Seq2UMTree) model to minimize the effects of exposure bias by limiting the decoding length to three within a triplet and removing the order among triplets. We evaluate our model on two datasets, DuIE and NYT, and systematically study how exposure bias alters the performance of Seq2Seq models. Experiments show that the state-of-the-art Seq2Seq model overfits to both datasets while Seq2UMTree shows significantly better generalization. Our code is available at https://github.com/WindChimeRan/OpenJERE.
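
As a rough illustration of the two issues named above (an imposed order among triplets and a decoding length that grows with the number of triplets), the following Python sketch contrasts a flattened, ordered triplet target with an unordered set of triplets. It is not taken from the OpenJERE code: the helper names, the (subject, relation, object) field layout, and the toy triplets are all assumptions made for this example.

```python
from typing import List, Set, Tuple

# Assumed field layout for illustration: (subject, relation, object).
Triplet = Tuple[str, str, str]


def flatten_as_sequence(triplets: List[Triplet]) -> List[str]:
    """Seq2Seq-style target: all triplets concatenated in one fixed order."""
    return [element for triplet in triplets for element in triplet]


def as_unordered_set(triplets: List[Triplet]) -> Set[Triplet]:
    """Order-free target: the same triplets with no ordering among them."""
    return set(triplets)


def sequence_decoding_length(triplets: List[Triplet]) -> int:
    """Decoding steps when every triplet is emitted in one long sequence."""
    return len(flatten_as_sequence(triplets))


def per_triplet_decoding_length(triplets: List[Triplet]) -> int:
    """Decoding steps within a single triplet: three, however many there are."""
    return 3 if triplets else 0


# Toy triplets, hypothetical and used only to exercise the helpers.
gold = [
    ("Obama", "born_in", "Honolulu"),
    ("Obama", "president_of", "USA"),
]
permuted = list(reversed(gold))

# The flattened targets differ although the triplets are the same.
order_sensitive = flatten_as_sequence(gold) != flatten_as_sequence(permuted)
# The unordered targets are identical.
order_invariant = as_unordered_set(gold) == as_unordered_set(permuted)
```

In this sketch, reordering the gold triplets changes the flattened target but leaves the set unchanged, and the flattened target grows by three steps per additional triplet while the length within any single triplet stays at three.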