CL · AI · Oct 18, 2023

Enhancing Low-resource Fine-grained Named Entity Recognition by Leveraging Coarse-grained Datasets

arXiv:2310.11715v221.1131 citations · h-index: 21 · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses data scarcity in fine-grained NER and represents an incremental advance for natural language processing applications.

The paper tackles the problem of insufficient labeled data in fine-grained Named Entity Recognition by leveraging coarse-grained datasets, achieving performance improvements over K-shot and supervised learning methods with limited fine-grained annotations.

Named Entity Recognition (NER) frequently suffers from the problem of insufficient labeled data, particularly in fine-grained NER scenarios. Although $K$-shot learning techniques can be applied, their performance tends to saturate when the number of annotations exceeds several tens of labels. To overcome this problem, we utilize existing coarse-grained datasets that offer a large number of annotations. A straightforward approach to address this problem is pre-finetuning, which employs coarse-grained data for representation learning. However, it cannot directly utilize the relationships between fine-grained and coarse-grained entities, although a fine-grained entity type is likely to be a subcategory of a coarse-grained entity type. We propose a fine-grained NER model with a Fine-to-Coarse (F2C) mapping matrix to leverage the hierarchical structure explicitly. In addition, we present an inconsistency filtering method to eliminate coarse-grained entities that are inconsistent with fine-grained entity types to avoid performance degradation. Our experimental results show that our method outperforms both $K$-shot learning and supervised learning methods when dealing with a small number of fine-grained annotations.
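
The abstract does not spell out how the F2C mapping matrix or the inconsistency filter is computed, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the matrix maps a fine-grained probability distribution to a coarse-grained one by a simple matrix product, and that a coarse-grained annotation is dropped when the projected probability of its label falls below a threshold. The label sets, the matrix values, the threshold, and all function names are hypothetical and are not taken from the paper or its code.

```python
# Hypothetical label sets, chosen only for illustration.
FINE_TYPES = ["O", "person-actor", "person-politician", "location-city"]
COARSE_TYPES = ["O", "PER", "LOC"]

# Assumed F2C mapping matrix: row i says how fine type i maps onto coarse types.
# A hard one-to-one parent mapping is used here; a learned or soft matrix
# would simply have fractional rows that sum to 1.
F2C = [
    [1.0, 0.0, 0.0],  # O                 -> O
    [0.0, 1.0, 0.0],  # person-actor      -> PER
    [0.0, 1.0, 0.0],  # person-politician -> PER
    [0.0, 0.0, 1.0],  # location-city     -> LOC
]


def fine_to_coarse(fine_probs, f2c=F2C):
    """Project a fine-grained distribution onto coarse types (p_fine @ F2C)."""
    n_coarse = len(f2c[0])
    return [
        sum(p * row[j] for p, row in zip(fine_probs, f2c))
        for j in range(n_coarse)
    ]


def is_consistent(fine_probs, coarse_label, threshold=0.5, f2c=F2C):
    """Assumed filter rule: keep a coarse annotation only if the projected
    probability of its coarse label reaches the threshold."""
    coarse_probs = fine_to_coarse(fine_probs, f2c)
    return coarse_probs[COARSE_TYPES.index(coarse_label)] >= threshold


def filter_coarse_examples(examples, threshold=0.5):
    """Drop coarse-grained examples that disagree with the fine-grained model."""
    return [
        ex for ex in examples
        if is_consistent(ex["fine_probs"], ex["coarse_label"], threshold)
    ]
```

In this sketch, a token whose fine-grained prediction puts most of its mass on `person-actor` and `person-politician` would be kept if its coarse-grained annotation is `PER` and discarded if it is `LOC`. The kept examples could then supply a training signal through the projected coarse-grained distribution.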
