CL · Mar 16, 2024

Rules still work for Open Information Extraction

arXiv:2403.10758v2 · 1 citation · h-index: 2 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the problem of extracting relations from Chinese text for NLP applications, representing an incremental advance with domain-specific improvements.

The paper tackles open information extraction for Chinese text by introducing APRCOIE, a model that autonomously generates extraction patterns to handle diverse grammatical phenomena. Trained on a manually annotated dataset, it is reported to outperform state-of-the-art Chinese OIE models.

Open information extraction (OIE) aims to extract surface relations and their corresponding arguments from natural language text, irrespective of domain. This paper presents an innovative OIE model, APRCOIE, tailored for Chinese text. Unlike previous models, our model generates extraction patterns autonomously. The model defines a new pattern form for Chinese OIE and proposes an automated pattern generation methodology. This lets the model handle a wide array of complex and diverse Chinese grammatical phenomena. We design a preliminary filter based on tensor computing to conduct the extraction procedure efficiently. To train the model, we manually annotated a large-scale Chinese OIE dataset. In the comparative evaluation, we demonstrate that APRCOIE outperforms state-of-the-art Chinese OIE models and significantly expands the boundaries of achievable OIE performance. The code of APRCOIE and the annotated dataset are released on GitHub (https://github.com/jialin666/APRCOIE_v1).
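The abstract mentions a preliminary filter that runs before pattern-based extraction but gives no details. The sketch below is not APRCOIE's method; it only illustrates the general idea of a cheap count-based prefilter followed by a full pattern match. The tag inventory, the two patterns, and every function name are hypothetical, and the per-pattern comparison is written as a plain loop where a real system might batch it as a single tensor comparison.

```python
from collections import Counter

# Hypothetical tag inventory and patterns; none of these come from the paper.
TAGS = ["NOUN", "VERB", "ADP", "PART"]

PATTERNS = {
    "subj-verb-obj": ["NOUN", "VERB", "NOUN"],
    "subj-prep-obj-verb": ["NOUN", "ADP", "NOUN", "VERB"],
}


def count_vector(tags):
    """Map a tag sequence to a fixed-length vector of tag counts."""
    counts = Counter(tags)
    return [counts[t] for t in TAGS]


def prefilter(sentence_tags):
    """Return names of patterns whose tag counts the sentence can satisfy.

    A pattern survives only if, for every tag, the sentence contains at
    least as many occurrences as the pattern requires. This is a necessary
    condition for a match, not a sufficient one.
    """
    sent_vec = count_vector(sentence_tags)
    survivors = []
    for name, pattern_tags in PATTERNS.items():
        pat_vec = count_vector(pattern_tags)
        if all(s >= p for s, p in zip(sent_vec, pat_vec)):
            survivors.append(name)
    return survivors


def match_in_order(sentence_tags, pattern_tags):
    """Full check: find the pattern as an ordered subsequence of the tags.

    Returns the matched token indices, or None if there is no match.
    """
    indices = []
    pos = 0
    for wanted in pattern_tags:
        while pos < len(sentence_tags) and sentence_tags[pos] != wanted:
            pos += 1
        if pos == len(sentence_tags):
            return None
        indices.append(pos)
        pos += 1
    return indices


def extract(tokens, tags):
    """Run the full match only on patterns that pass the prefilter."""
    results = []
    for name in prefilter(tags):
        idx = match_in_order(tags, PATTERNS[name])
        if idx is not None:
            results.append((name, [tokens[i] for i in idx]))
    return results
```

For example, `extract(["小明", "喜欢", "音乐"], ["NOUN", "VERB", "NOUN"])` discards the second pattern at the prefilter stage because the sentence has no `ADP` tag, so only the first pattern reaches the more expensive ordered match.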

Code Implementations: 1 repo