CL | LG | Jan 22, 2020

Contextualized Embeddings in Named-Entity Recognition: An Empirical Study on Generalization

arXiv:2001.08053v1 | 28 citations
AI Analysis

This paper addresses generalization issues in NER for researchers, but it is incremental: it focuses on empirical validation of existing methods.

The paper tackles the overestimation of lexical features in Named-Entity Recognition benchmarks by empirically analyzing how well contextualized embeddings generalize. It shows that they improve the detection of unseen mentions: for models trained on CoNLL03, contextualization yields a maximal relative micro-F1 increase of +1.2% in-domain versus +13% out-of-domain on WNUT.

Contextualized embeddings use unsupervised language model pretraining to compute word representations depending on their context. This is intuitively useful for generalization, especially in Named-Entity Recognition where it is crucial to detect mentions never seen during training. However, standard English benchmarks overestimate the importance of lexical over contextual features because of an unrealistic lexical overlap between train and test mentions. In this paper, we perform an empirical analysis of the generalization capabilities of state-of-the-art contextualized embeddings by separating mentions by novelty and with out-of-domain evaluation. We show that they are particularly beneficial for unseen mentions detection, especially out-of-domain. For models trained on CoNLL03, language model contextualization leads to a +1.2% maximal relative micro-F1 score increase in-domain against +13% out-of-domain on the WNUT dataset.
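The evaluation idea in the abstract, scoring mentions separately by novelty and reporting a relative micro-F1 increase, can be illustrated with a minimal sketch. This is not the authors' code. The two-way seen/unseen split by exact surface-form match, the mention tuple layout, and the function names `f1_by_novelty` and `relative_increase` are all assumptions made for illustration; the paper may define novelty differently.

```python
# Hypothetical sketch: score NER mentions separately by novelty.
# A mention is a tuple (sentence_id, start, end, text, entity_type).
# Assumption: a mention is "seen" if its exact surface text occurred
# as a mention in the training set, and "unseen" otherwise.

def f1_by_novelty(gold, pred, train_surface_forms):
    """Return micro-F1 for seen and unseen mentions."""
    groups = {"seen": (set(), set()), "unseen": (set(), set())}
    for idx, mentions in enumerate((gold, pred)):
        for mention in mentions:
            key = "seen" if mention[3] in train_surface_forms else "unseen"
            groups[key][idx].add(mention)

    scores = {}
    for key, (gold_group, pred_group) in groups.items():
        # A prediction is correct only if span, text, and type all match.
        true_positives = len(gold_group & pred_group)
        precision = true_positives / len(pred_group) if pred_group else 0.0
        recall = true_positives / len(gold_group) if gold_group else 0.0
        denominator = precision + recall
        scores[key] = 2 * precision * recall / denominator if denominator else 0.0
    return scores


def relative_increase(baseline_f1, improved_f1):
    """Relative increase in percent of improved_f1 over baseline_f1."""
    return 100.0 * (improved_f1 - baseline_f1) / baseline_f1


# Toy data, invented for this example only.
train_surface_forms = {"London", "Obama"}
gold = {
    (0, 0, 1, "London", "LOC"),
    (1, 2, 3, "Zorbia", "LOC"),
    (2, 0, 1, "Obama", "PER"),
}
pred = {
    (0, 0, 1, "London", "LOC"),
    (1, 2, 3, "Zorbia", "ORG"),  # right span, wrong type: counted as an error
    (2, 0, 1, "Obama", "PER"),
}
scores = f1_by_novelty(gold, pred, train_surface_forms)
```

On this toy data the model gets every seen mention right and misses the only unseen one, which is the kind of gap an aggregate score hides when train and test mentions overlap heavily. Note that a relative increase is measured against the baseline score, so it differs from an absolute gain in F1 points.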

Code Implementations: 1 repo