CL, Jun 2, 2020

Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition

arXiv:2006.01372v2, 999 citations
AI Analysis

This work addresses the challenge of handling low-frequency labels in fine-grained named entity recognition, a challenge that limits accuracy in natural language processing tasks. The contribution is incremental, since it builds on existing sequence labeling methods.

The paper tackles sequence labeling for fine-grained named entity recognition by integrating embeddings of label components, such as span and type information, into the model. Experiments on English and Japanese datasets show improved performance, particularly for low-frequency labels.

In general, the labels used in sequence labeling consist of different types of elements. For example, IOB-format entity labels, such as B-Person and I-Person, can be decomposed into span (B and I) and type information (Person). Most sequence labeling models do not consider such label components, yet components shared across labels, such as Person, can be beneficial for label prediction. In this work, we propose to integrate label component information as embeddings into models. Through experiments on English and Japanese fine-grained named entity recognition, we demonstrate that the proposed method improves performance, especially for instances with low-frequency labels.
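
The label decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`decompose`, `build_component_table`, `label_embedding`), the random initial vectors, and the choice to compose a label embedding by summing its component vectors are all assumptions made for illustration. The abstract only says that label component information is integrated as embeddings.

```python
import random


def decompose(label):
    """Split an IOB label into namespaced components.

    "B-Person" -> [("span", "B"), ("type", "Person")]; "O" has a span only.
    """
    if label == "O":
        return [("span", "O")]
    span, _, entity_type = label.partition("-")
    return [("span", span), ("type", entity_type)]


def build_component_table(labels, dim=4, seed=0):
    """Give each distinct component one vector (random here, learned in practice)."""
    rng = random.Random(seed)
    table = {}
    for label in labels:
        for component in decompose(label):
            if component not in table:
                table[component] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return table


def label_embedding(label, table):
    """Compose a label embedding by summing component vectors (an assumption)."""
    vectors = [table[component] for component in decompose(label)]
    return [sum(values) for values in zip(*vectors)]


labels = ["O", "B-Person", "I-Person", "B-City", "I-City"]
table = build_component_table(labels)
b_person = label_embedding("B-Person", table)
i_person = label_embedding("I-Person", table)
```

In this sketch, B-Person and I-Person share the `("type", "Person")` vector, so any update to that vector from one label also affects the other. That sharing is the property the abstract suggests can help low-frequency labels.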

Code Implementations: 1 repo
