CL Apr 8, 2020

Self-Attention Gazetteer Embeddings for Named-Entity Recognition

Stanislav Peshterliev, Christophe Dupuy, Imre Kiss

arXiv:2004.04060v21.08 citations Has Code

Originality Incremental advance

AI Analysis

This work addresses NER accuracy for NLP practitioners. It is an incremental advance: it builds on existing methods and yields modest gains.

The paper improves named-entity recognition by integrating external knowledge from gazetteers, raising F1 from 92.34 to 92.86 on CoNLL-03 and from 89.11 to 89.32 on Ontonotes 5.

Recent attempts to ingest external knowledge into neural models for named-entity recognition (NER) have exhibited mixed results. In this work, we present GazSelfAttn, a novel gazetteer embedding approach that uses self-attention and match span encoding to build enhanced gazetteer embeddings. In addition, we demonstrate how to build gazetteer resources from the open source Wikidata knowledge base. Evaluations on the CoNLL-03 and Ontonotes 5 datasets show F1 improvements over the baseline model from 92.34 to 92.86 and from 89.11 to 89.32, respectively, achieving performance comparable to large state-of-the-art models.
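To make the idea of gazetteer matching with span encoding concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy `GAZETTEER`, the `span_tags` helper, and the B/I/E/S position scheme are all assumptions chosen for illustration. It covers only the matching step; in a model along the lines the abstract describes, the resulting tags would be embedded and combined per token with self-attention.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy gazetteer: entity type -> set of lowercased token tuples.
GAZETTEER: Dict[str, Set[Tuple[str, ...]]] = {
    "LOC": {("new", "york"), ("york",)},
    "ORG": {("new", "york", "times")},
}


def span_tags(
    tokens: List[str],
    gazetteer: Dict[str, Set[Tuple[str, ...]]],
    max_len: int = 4,
) -> List[List[str]]:
    """For each token, list the TYPE-POSITION tags of every gazetteer match covering it.

    Positions (an assumed scheme): B = begin, I = inside, E = end, S = single-token match.
    """
    lowered = [t.lower() for t in tokens]
    tags: List[Set[str]] = [set() for _ in tokens]
    for start in range(len(lowered)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(lowered):
                break
            ngram = tuple(lowered[start:end])
            for etype, entries in gazetteer.items():
                if ngram not in entries:
                    continue
                if length == 1:
                    tags[start].add(f"{etype}-S")
                else:
                    tags[start].add(f"{etype}-B")
                    for i in range(start + 1, end - 1):
                        tags[i].add(f"{etype}-I")
                    tags[end - 1].add(f"{etype}-E")
    # Sort so the output is deterministic.
    return [sorted(t) for t in tags]


if __name__ == "__main__":
    sentence = ["The", "New", "York", "Times", "reported"]
    for token, token_tags in zip(sentence, span_tags(sentence, GAZETTEER)):
        print(token, token_tags)
```

Note that a single token can carry several overlapping tags (here "York" is matched as part of a location, as a location on its own, and as part of an organization), which is the kind of ambiguity an attention mechanism over matches could be used to weigh.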
