CL · AI · May 5, 2023

Open Information Extraction via Chunks

arXiv:2305.03299v1133 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for natural language processing researchers working on information extraction.

The paper tackles Open Information Extraction (OIE) by representing a sentence as a chunk sequence (Sentence as Chunk sequence, SaC) rather than a token sequence. Its tuple extractor built on SaC, Chunk-OIE, achieves state-of-the-art results on multiple OIE datasets.

Open Information Extraction (OIE) aims to extract relational tuples from open-domain sentences. Existing OIE systems split a sentence into tokens and recognize token spans as tuple relations and arguments. We instead propose Sentence as Chunk sequence (SaC) and recognize chunk spans as tuple relations and arguments. We argue that SaC has better quantitative and qualitative properties for OIE than sentence as token sequence, and evaluate four choices of chunks (i.e., CoNLL chunks, simple phrases, NP chunks, and spans from SpanOIE) against gold OIE tuples. Accordingly, we propose a simple BERT-based model for sentence chunking, and propose Chunk-OIE for tuple extraction on top of SaC. Chunk-OIE achieves state-of-the-art results on multiple OIE datasets, showing that SaC benefits the OIE task.
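
To make the token-versus-chunk distinction concrete, here is a minimal sketch in Python. It is not the authors' implementation: the example sentence, the hand-written chunk boundaries, the `Chunk` class, the helper names, and the `ARG0`/`REL`/`ARG1`/`O` tag names are all assumptions made for illustration. In the paper the chunks come from a BERT-based chunker and the tags from Chunk-OIE, neither of which is reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A contiguous token span (hypothetical helper, not from the paper)."""
    start: int  # index of the first token, inclusive
    end: int    # index one past the last token, exclusive
    label: str  # phrase type such as "NP" or "VP"


def to_chunk_sequence(tokens, chunks):
    """Render a sentence as a list of chunk strings.

    Raises ValueError unless the chunks are non-empty, contiguous,
    and cover every token exactly once.
    """
    expected_start = 0
    rendered = []
    for chunk in chunks:
        if chunk.start != expected_start or chunk.end <= chunk.start:
            raise ValueError("chunks must be contiguous and non-empty")
        rendered.append(" ".join(tokens[chunk.start:chunk.end]))
        expected_start = chunk.end
    if expected_start != len(tokens):
        raise ValueError("chunks must cover every token")
    return rendered


def extract_tuple(chunk_texts, chunk_tags):
    """Collect chunks by tag, skipping chunks tagged "O" (outside the tuple)."""
    slots = {}
    for text, tag in zip(chunk_texts, chunk_tags):
        if tag != "O":
            slots.setdefault(tag, []).append(text)
    return {tag: " ".join(parts) for tag, parts in slots.items()}


# Sentence as a token sequence: 8 units to label.
tokens = ["The", "company", "acquired", "a", "small", "startup", "in", "Berlin"]

# Hand-written chunk boundaries (assumed; a real system would predict them).
chunks = [
    Chunk(0, 2, "NP"),
    Chunk(2, 3, "VP"),
    Chunk(3, 6, "NP"),
    Chunk(6, 7, "PP"),
    Chunk(7, 8, "NP"),
]

# Sentence as a chunk sequence: 5 units to label.
sac = to_chunk_sequence(tokens, chunks)

# One illustrative tag per chunk instead of one tag per token.
chunk_tags = ["ARG0", "REL", "ARG1", "O", "O"]
oie_tuple = extract_tuple(sac, chunk_tags)

if __name__ == "__main__":
    print(sac)
    print(oie_tuple)
```

In this toy setup the tagger labels five chunks rather than eight tokens, and each argument or relation boundary coincides with a chunk boundary by construction, which is the property the abstract argues for.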

Code Implementations: 1 repo
