CL · AI · Nov 13, 2023

Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models

arXiv:2311.07314v1140 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the problem of reducing human annotation effort for document-level relation extraction. The advance is incremental because it builds on existing LLM capabilities.

The authors tackled the challenge of automating annotation for document-level relation extraction by integrating a large language model with a natural language inference module to generate relation triples, resulting in an enhanced dataset called DocGNRE that excels in re-annotating long-tail relation types.

Document-level Relation Extraction (DocRE), which aims to extract relations from a long context, is a critical challenge in achieving fine-grained structural comprehension and generating interpretable document representations. Inspired by recent advances in the in-context learning capabilities that emerge in large language models (LLMs) such as ChatGPT, we aim to design an automated annotation method for DocRE with minimal human effort. Unfortunately, vanilla in-context learning is infeasible for document-level relation extraction because of the large number of predefined fine-grained relation types and the uncontrolled generations of LLMs. To tackle this issue, we propose a method that integrates an LLM and a natural language inference (NLI) module to generate relation triples, thereby augmenting document-level relation datasets. We demonstrate the effectiveness of our approach by introducing an enhanced dataset, DocGNRE, which excels in re-annotating numerous long-tail relation types. We are confident that our method holds potential for broader application to domain-specific relation type definitions and offers tangible benefits in advancing generalized language semantic comprehension.
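The abstract does not spell out how the LLM and the NLI module interact, so the following is only a minimal sketch of one plausible reading: free-form triples proposed by an LLM are mapped onto predefined relation types by scoring a templated hypothesis against the document. The names `align_triples`, `toy_nli_score`, the templates, and the 0.5 threshold are all hypothetical, and the toy scorer is a stand-in for a real NLI model, not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def align_triples(
    document: str,
    generated: List[Triple],
    relation_templates: Dict[str, str],
    nli_score: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Triple]:
    """Map free-form LLM triples onto predefined relation types.

    For each generated (head, relation, tail), every predefined relation
    type is turned into a hypothesis sentence and scored against the
    document. The best-scoring type above the threshold is kept.
    """
    kept: List[Triple] = []
    for head, _free_relation, tail in generated:
        best_type, best_score = None, threshold
        for rel_type, template in relation_templates.items():
            hypothesis = template.format(head=head, tail=tail)
            score = nli_score(document, hypothesis)
            if score > best_score:
                best_type, best_score = rel_type, score
        if best_type is not None:
            kept.append((head, best_type, tail))
    return kept


def toy_nli_score(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: 1.0 on a verbatim match, else 0.0."""
    return 1.0 if hypothesis.lower() in premise.lower() else 0.0


# Illustrative data only; in practice the triples would come from an LLM.
document = "Marie Curie was born in Warsaw. Marie Curie worked in Paris."
templates = {
    "place_of_birth": "{head} was born in {tail}",
    "work_location": "{head} worked in {tail}",
}
llm_triples = [
    ("Marie Curie", "birthplace", "Warsaw"),
    ("Marie Curie", "lived", "London"),  # unsupported, should be dropped
]
aligned = align_triples(document, llm_triples, templates, toy_nli_score)
print(aligned)
```

In this sketch the NLI step serves two purposes at once: it constrains uncontrolled LLM output to the predefined label set, and it discards triples that the document does not support.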

Code Implementations: 1 repo