CL, Jun 28, 2017

Named Entity Disambiguation for Noisy Text

arXiv:1706.09147v21121 citations
Originality: Highly original
AI Analysis

This work addresses entity disambiguation for noisy web text, an incremental step beyond existing news-based approaches.

The authors tackle Named Entity Disambiguation in noisy text by introducing WikilinksNED, a challenging dataset, together with a neural model that uses an improved embedding initialization and informative negative sampling. The model achieves significant performance gains on this dataset while matching existing methods on newswire data.

We address the task of Named Entity Disambiguation (NED) for noisy text. We present WikilinksNED, a large-scale NED dataset of text fragments from the web, which is significantly noisier and more challenging than existing news-based datasets. To capture the limited and noisy local context surrounding each mention, we design a neural model and train it with a novel method for sampling informative negative examples. We also describe a new way of initializing word and entity embeddings that significantly improves performance. Our model significantly outperforms existing state-of-the-art methods on WikilinksNED while achieving comparable performance on a smaller newswire dataset.

Code Implementations: 1 repo