CL · IR · LG · Jul 6, 2022

Strong Heuristics for Named Entity Linking

arXiv:2207.02824v1628 citations · h-index: 54
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of linking emerging entities in news for NLP applications, but the advance is incremental because it builds on existing heuristics.

The paper tackles named entity linking in news by proposing intuitive, lightweight, and scalable heuristics; the best-performing heuristic disambiguates 94% of the mentions on Quotebank and 63% on AIDA-CoNLL.

Named entity linking (NEL) in news is a challenging endeavour due to the frequency of unseen and emerging entities, which necessitates the use of unsupervised or zero-shot methods. However, such methods tend to come with caveats, such as no integration of suitable knowledge bases (like Wikidata) for emerging entities, a lack of scalability, and poor interpretability. Here, we consider person disambiguation in Quotebank, a massive corpus of speaker-attributed quotations from the news, and investigate the suitability of intuitive, lightweight, and scalable heuristics for NEL in web-scale corpora. Our best performing heuristic disambiguates 94% and 63% of the mentions on Quotebank and the AIDA-CoNLL benchmark, respectively. Additionally, the proposed heuristics compare favourably to the state-of-the-art unsupervised and zero-shot methods, Eigenthemes and mGENRE, respectively, thereby serving as strong baselines for unsupervised and zero-shot entity linking.

Code Implementations: 1 repo
