CL · Aug 16, 2021

Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss

arXiv:2108.07216v1650 citations
Originality: Highly original
AI Analysis

This paper addresses partially supervised NER for NLP practitioners by enabling effective learning from incomplete annotations. It is a strong, specific gain rather than a broad paradigm shift.

The paper tackles the problem of learning named entity recognizers when entity annotations are missing. It proposes the Expected Entity Ratio loss and reports gains of +12.7 and +2.3 F1 over the previous state-of-the-art methods of Mayhew et al. (2019) and Li et al. (2021), respectively, in a challenging setting with only 1,000 biased annotations, averaged across 7 datasets.

We study learning named entity recognizers in the presence of missing entity annotations. We approach this setting as tagging with latent variables and propose a novel loss, the Expected Entity Ratio, to learn models in the presence of systematically missing tags. We show that our approach is both theoretically sound and empirically useful. Experimentally, we find that it meets or exceeds performance of strong and state-of-the-art baselines across a variety of languages, annotation scenarios, and amounts of labeled data. In particular, we find that it significantly outperforms the previous state-of-the-art methods from Mayhew et al. (2019) and Li et al. (2021) by +12.7 and +2.3 F1 score in a challenging setting with only 1,000 biased annotations, averaged across 7 datasets. We also show that, when combined with our approach, a novel sparse annotation scheme outperforms exhaustive annotation for modest annotation budgets.
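The abstract names the Expected Entity Ratio loss but does not state its formula, so the following is only a minimal sketch of the general idea, under stated assumptions: the model supplies, for each token, a marginal probability of being part of an entity, and the loss penalizes the gap between the average of those probabilities and a target entity ratio once the gap exceeds a tolerance. The function names, the `target_ratio` and `margin` parameters, and the hinge form are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def expected_entity_ratio(entity_probs: Sequence[Sequence[float]]) -> float:
    """Average per-token probability of being inside an entity.

    entity_probs holds one list per sentence; each value is an assumed
    model marginal P(token is part of an entity).
    """
    total = sum(sum(sentence) for sentence in entity_probs)
    count = sum(len(sentence) for sentence in entity_probs)
    if count == 0:
        raise ValueError("need at least one token")
    return total / count


def ratio_penalty(
    entity_probs: Sequence[Sequence[float]],
    target_ratio: float,  # hypothetical: expected fraction of entity tokens
    margin: float = 0.05,  # hypothetical: tolerated deviation from the target
) -> float:
    """Hinge-style penalty on the gap between expected and target ratio.

    Returns 0.0 while the expected ratio stays within `margin` of the
    target, and grows linearly with the gap beyond that.
    """
    ratio = expected_entity_ratio(entity_probs)
    return max(0.0, abs(ratio - target_ratio) - margin)


if __name__ == "__main__":
    # Two toy sentences with made-up per-token entity probabilities.
    probs = [[0.9, 0.1, 0.0], [0.2]]
    print(expected_entity_ratio(probs))
    print(ratio_penalty(probs, target_ratio=0.15))
```

In a setting like the one the abstract describes, a term of this kind would presumably be added to a supervised loss on the observed tags, so that unannotated tokens are not all pushed toward the non-entity label.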

Code Implementations: 1 repo
