CL | Apr 8, 2020

Self-Attention Gazetteer Embeddings for Named-Entity Recognition

arXiv:2004.04060v2 | 8 citations | Has Code
AI Analysis

This work aims to improve NER accuracy for NLP practitioners, but it is incremental: it builds on existing methods and reports modest gains.

The paper improves named-entity recognition by integrating external knowledge from gazetteers, raising F1 from 92.34 to 92.86 (+0.52) on CoNLL-03 and from 89.11 to 89.32 (+0.21) on Ontonotes 5.

Recent attempts to ingest external knowledge into neural models for named-entity recognition (NER) have exhibited mixed results. In this work, we present GazSelfAttn, a novel gazetteer embedding approach that uses self-attention and match span encoding to build enhanced gazetteer embeddings. In addition, we demonstrate how to build gazetteer resources from the open-source Wikidata knowledge base. Evaluations on the CoNLL-03 and Ontonotes 5 datasets show F1 improvements over the baseline model from 92.34 to 92.86 and from 89.11 to 89.32, respectively, achieving performance comparable to large state-of-the-art models.
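To make "match span encoding" concrete, here is a minimal sketch of gazetteer matching that records where each token sits inside a matched span. The abstract does not specify the encoding, so the BILU-style tags (B = begin, I = inside, L = last, U = single-token match), the toy gazetteer, and the function name match_span_tags are all assumptions made for illustration, not the authors' implementation. The self-attention step that the paper applies to the resulting embeddings is not shown.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy gazetteer (lowercased token tuples -> entity type).
# A real resource would be built from a knowledge base such as Wikidata.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("new", "york"): "LOC",
    ("new", "york", "times"): "ORG",
    ("york",): "LOC",
}


def match_span_tags(
    tokens: List[str],
    gazetteer: Dict[Tuple[str, ...], str],
    max_len: int = 3,
) -> List[Set[str]]:
    """For each token, collect span-position tags from every gazetteer match covering it.

    Assumed tag scheme: B-/I-/L- for the first/middle/last token of a
    multi-token match, U- for a single-token match.
    """
    lowered = [t.lower() for t in tokens]
    tags: List[Set[str]] = [set() for _ in tokens]
    for start in range(len(lowered)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(lowered):
                break  # span would run past the sentence
            etype = gazetteer.get(tuple(lowered[start:end]))
            if etype is None:
                continue
            if length == 1:
                tags[start].add(f"U-{etype}")
            else:
                tags[start].add(f"B-{etype}")
                for i in range(start + 1, end - 1):
                    tags[i].add(f"I-{etype}")
                tags[end - 1].add(f"L-{etype}")
    return tags


if __name__ == "__main__":
    sentence = ["The", "New", "York", "Times"]
    for token, token_tags in zip(sentence, match_span_tags(sentence, GAZETTEER)):
        print(token, sorted(token_tags))
```

Overlapping matches are kept rather than resolved, so a token such as "York" carries several tags at once. In a model along the lines the abstract describes, each tag would presumably be mapped to a learned embedding and the per-token embeddings combined before self-attention; that combination step is likewise an assumption here.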
