IMoJIE: Iterative Memory-Based Joint Open Information Extraction
This work improves Open Information Extraction (OpenIE) systems by reducing redundancy and increasing the diversity of extractions. The contribution is incremental, since it builds directly on an existing neural model.
The paper addresses two limitations of CopyAttention, a neural model for Open Information Extraction: it produces a constant number of extractions per sentence, and those extractions are often redundant. It introduces IMoJIE, an iterative memory-based model that generates a variable number of diverse extractions. IMoJIE improves on CopyAttention by about 18 F1 points and on a BERT-based baseline by 2 F1 points, setting a new state of the art.
While traditional systems for Open Information Extraction were statistical and rule-based, neural models have recently been introduced for the task. Our work builds upon CopyAttention, a sequence generation OpenIE model (Cui et al., 2018). Our analysis reveals that CopyAttention produces a constant number of extractions per sentence, and its extracted tuples often express redundant information. We present IMoJIE, an extension to CopyAttention, which produces the next extraction conditioned on all previously extracted tuples. This approach overcomes both shortcomings of CopyAttention, resulting in a variable number of diverse extractions per sentence. We train IMoJIE on training data bootstrapped from extractions of several non-neural systems, which have been automatically filtered to reduce redundancy and noise. IMoJIE outperforms CopyAttention by about 18 F1 points, and a strong BERT-based baseline by 2 F1 points, establishing a new state of the art for the task.
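
The idea of producing each extraction conditioned on all previously extracted tuples can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `iterative_extract`, `toy_generator`, the `END` marker, and the `[SEP]` separator are hypothetical, and the neural sequence generator is replaced by a toy stand-in that returns the first candidate not already present in its input.

```python
from typing import Callable, List

# Hypothetical stop marker; the real model's end signal may differ.
END = "<end>"


def iterative_extract(
    sentence: str,
    generate_next: Callable[[str], str],
    max_steps: int = 10,
) -> List[str]:
    """Generate extractions one at a time, feeding earlier ones back in.

    `generate_next` stands in for a sequence generator: it receives the
    sentence joined with every extraction produced so far and returns
    either the next extraction or END.
    """
    memory = [sentence]          # sentence plus all extractions so far
    extractions: List[str] = []
    for _ in range(max_steps):
        context = " [SEP] ".join(memory)
        nxt = generate_next(context)
        if nxt == END:           # generator decides how many tuples to emit
            break
        extractions.append(nxt)
        memory.append(nxt)       # condition later steps on this extraction
    return extractions


def toy_generator(candidates: List[str]) -> Callable[[str], str]:
    """Toy stand-in: return the first candidate absent from the context."""
    def generate_next(context: str) -> str:
        for cand in candidates:
            if cand not in context:
                return cand
        return END
    return generate_next


if __name__ == "__main__":
    sent = "IMoJIE extends CopyAttention and produces diverse extractions."
    cands = [
        "(IMoJIE; extends; CopyAttention)",
        "(IMoJIE; produces; diverse extractions)",
    ]
    for tup in iterative_extract(sent, toy_generator(cands)):
        print(tup)
```

Because the generator sees its own earlier output, it can avoid repeating a tuple and can stop once nothing new remains, which is how a loop of this shape yields a variable number of non-redundant extractions per sentence.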