CL · IR · Jan 20, 2024

Exploiting Duality in Open Information Extraction with Predicate Prompt

arXiv:2401.11107v1 · 3 citations · WSDM
Originality: Incremental advance
AI Analysis

This work aims to improve information extraction for applications such as search systems. It appears incremental: it builds on existing generative methods and adds a dual-task approach.

The paper tackles the challenge of extracting multiple complicated triplets in open information extraction by proposing DualOIE, a generative model that performs a dual task of extracting triplets and converting them back into sentences, achieving state-of-the-art performance on benchmarks and improving QV-CTR by 0.93% and UV-CTR by 0.56% in an online search system.

Open information extraction (OpenIE) aims to extract schema-free triplets of the form (subject, predicate, object) from a given sentence. Compared with general information extraction (IE), OpenIE poses more challenges for IE models, especially when multiple complicated triplets exist in a sentence. To extract these complicated triplets more effectively, in this paper we propose a novel generative OpenIE model, namely DualOIE, which performs a dual task alongside extracting triplets from the sentence, i.e., converting the triplets back into the sentence. This dual task encourages the model to correctly recognize the structure of the given sentence and thus helps it extract all potential triplets from the sentence. Specifically, DualOIE extracts the triplets in two steps: 1) it first extracts a sequence of all potential predicates, 2) it then uses the predicate sequence as a prompt to induce the generation of triplets. Our experiments on two benchmarks and on our dataset constructed from Meituan demonstrate that DualOIE achieves the best performance among the state-of-the-art baselines. Furthermore, an online A/B test on the Meituan platform shows improvements of 0.93% in QV-CTR and 0.56% in UV-CTR when the triplets extracted by DualOIE were leveraged in Meituan's search system.
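The two-step procedure and the dual direction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The prompt layout, the "(s; p; o)" linearization, and every function name below are assumptions made for illustration, and the two model calls are passed in as plain callables so that any predicate extractor and text generator could be plugged in.

```python
from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def build_predicate_prompt(sentence: str, predicates: List[str]) -> str:
    """Step 2 input: prepend the predicate sequence to the sentence.

    The exact prompt format is an assumption, not taken from the paper.
    """
    return f"predicates: {' ; '.join(predicates)} | sentence: {sentence}"


def linearize_triplets(triplets: List[Triplet]) -> str:
    """Dual-task input: render triplets as text to be converted into a sentence.

    The "(s; p; o)" layout is a hypothetical linearization.
    """
    return " ".join(f"({s}; {p}; {o})" for s, p, o in triplets)


def parse_triplets(generated: str) -> List[Triplet]:
    """Parse generator output written in the same "(s; p; o)" layout."""
    triplets: List[Triplet] = []
    for chunk in generated.split(")"):
        chunk = chunk.strip().lstrip("(")
        if not chunk:
            continue
        parts = [part.strip() for part in chunk.split(";")]
        if len(parts) == 3:  # skip malformed chunks
            triplets.append((parts[0], parts[1], parts[2]))
    return triplets


def extract(
    sentence: str,
    predict_predicates: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> List[Triplet]:
    """Two-step extraction: predicates first, then predicate-prompted generation."""
    predicates = predict_predicates(sentence)              # step 1
    prompt = build_predicate_prompt(sentence, predicates)  # step 2 prompt
    return parse_triplets(generate(prompt))
```

In a training setup along these lines, the string from `linearize_triplets` would serve as the input of the dual task (triplets to sentence), while `extract` covers the forward direction. How the two objectives are combined is not specified in the abstract and is left out here.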

Code Implementations: 1 repo