CL, LG
Sep 1, 2019

Global Entity Disambiguation with BERT

arXiv:1909.00426v5
650 citations
Has Code
Originality: Incremental advance
AI Analysis

This paper addresses entity disambiguation for natural language processing applications and represents an incremental improvement over existing methods.

The paper tackles global entity disambiguation by proposing a BERT-based model that treats entities as input tokens and resolves mentions sequentially, achieving new state-of-the-art results on five standard datasets including AIDA-CoNLL and MSNBC.

We propose a global entity disambiguation (ED) model based on BERT. To capture global contextual information for ED, our model treats not only words but also entities as input tokens, and solves the task by sequentially resolving mentions to their referent entities and using resolved entities as inputs at each step. We train the model using a large entity-annotated corpus obtained from Wikipedia. We achieve new state-of-the-art results on five standard ED datasets: AIDA-CoNLL, MSNBC, AQUAINT, ACE2004, and WNED-WIKI. The source code and model checkpoint are available at https://github.com/studio-ousia/luke.
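The sequential resolution described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `resolve_sequentially` function, the `Scorer` signature, and the toy `PRIOR` and `RELATED` tables are all hypothetical, and the toy scorer stands in for the BERT-based model. Picking the highest-scoring unresolved mention at each step is also an assumption; the abstract says only that mentions are resolved sequentially and that resolved entities are fed back as inputs.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical scorer: (mention, candidate entity, words, resolved entities) -> score.
# In the paper this role is played by a BERT-based model; here it is a plain function.
Scorer = Callable[[str, str, Sequence[str], Sequence[str]], float]


def resolve_sequentially(
    words: Sequence[str],
    candidates: Dict[str, List[str]],
    score: Scorer,
) -> Dict[str, str]:
    """Resolve one mention per step, feeding resolved entities back as context."""
    resolved: Dict[str, str] = {}
    unresolved = list(candidates)
    while unresolved:
        context = list(resolved.values())
        best: Optional[Tuple[float, str, str]] = None
        # Score every candidate of every unresolved mention given the current context.
        for mention in unresolved:
            for entity in candidates[mention]:
                s = score(mention, entity, words, context)
                if best is None or s > best[0]:
                    best = (s, mention, entity)
        if best is None:  # remaining mentions have no candidates
            break
        _, mention, entity = best
        resolved[mention] = entity  # this entity becomes input for later steps
        unresolved.remove(mention)
    return resolved


# Toy data (invented for illustration only).
PRIOR = {"Lionel Messi": 0.9, "Barcelona": 0.6, "FC Barcelona": 0.4}
RELATED = {("Lionel Messi", "FC Barcelona")}


def toy_score(mention: str, entity: str, words: Sequence[str],
              resolved: Sequence[str]) -> float:
    # Prior plus a bonus for each already-resolved entity related to this candidate.
    bonus = sum(0.5 for r in resolved
                if (r, entity) in RELATED or (entity, r) in RELATED)
    return PRIOR[entity] + bonus


result = resolve_sequentially(
    ["Messi", "plays", "for", "Barcelona"],
    {"Messi": ["Lionel Messi"], "Barcelona": ["Barcelona", "FC Barcelona"]},
    toy_score,
)
print(result)
```

In this toy run, "Messi" is resolved first because its candidate has the highest score without any context. Once "Lionel Messi" is part of the context, the relatedness bonus lifts "FC Barcelona" above the city reading, which is the kind of global signal the abstract attributes to using resolved entities as inputs.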

Code Implementations: 1 repo