CL CR Oct 19, 2022

CEntRE: A paragraph-level Chinese dataset for Relation Extraction among Enterprises

arXiv:2210.10581v1, 3 citations, h-index: 22
Originality: Synthesis-oriented
AI Analysis

This paper addresses the scarcity of data for enterprise relation extraction, a task that matters for applications such as risk analysis. The contribution is incremental, however, because it centers on dataset creation rather than methodological innovation.

The authors tackled the lack of datasets for enterprise relation extraction by introducing CEntRE, a new Chinese dataset from business news, and found that six state-of-the-art models performed poorly on it, highlighting its difficulty.

Enterprise relation extraction aims to detect pairs of enterprise entities and identify the business relations between them from unstructured or semi-structured text data. It is crucial for several real-world applications such as risk analysis, rating research and supply chain security. However, previous work mainly focuses on extracting attribute information about enterprises, such as personnel and corporate business, and pays little attention to enterprise relation extraction. To encourage further progress in this area, we introduce CEntRE, a new dataset constructed from publicly available business news data with careful human annotation and intelligent data processing. Extensive experiments on CEntRE with six excellent models demonstrate the challenges of our proposed dataset.

