CL
May 25, 2023

Learn to Not Link: Exploring NIL Prediction in Entity Linking

arXiv:2305.15725v1224 citations
Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses a specific gap in entity linking for NLP researchers, but its contribution is incremental: it focuses on dataset creation and the evaluation of existing models.

The paper tackles the NIL prediction problem in entity linking, where mentions lack corresponding entities in a knowledge base, by introducing a new dataset NEL and showing that training with NIL mentions significantly improves prediction accuracy.

Entity linking models have achieved significant success by using pretrained language models to capture semantic features. However, the NIL prediction problem, which aims to identify mentions without a corresponding entity in the knowledge base, has received insufficient attention. We categorize mentions linking to NIL into Missing Entity and Non-Entity Phrase, and propose an entity linking dataset, NEL, that focuses on the NIL prediction problem. NEL takes ambiguous entities as seeds, collects relevant mention context from the Wikipedia corpus, and ensures the presence of mentions linking to NIL through human annotation and entity masking. We conduct a series of experiments with the widely used bi-encoder and cross-encoder entity linking models. The results show that both types of NIL mentions in the training data have a significant influence on the accuracy of NIL prediction. Our code and dataset can be accessed at https://github.com/solitaryzero/NIL_EL
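
The abstract names entity masking as one way NEL guarantees NIL mentions but does not spell out the procedure. The following is a minimal sketch of one plausible reading, not the authors' implementation: a random subset of entities is removed from the knowledge base, and every mention whose gold entity was removed is relabeled as NIL. The `Mention` class, the `mask_entities` function, the `mask_ratio` parameter, and the `NIL` label string are all hypothetical names introduced for illustration.

```python
import random
from dataclasses import dataclass, replace

NIL = "NIL"  # hypothetical label for mentions with no entity in the KB


@dataclass(frozen=True)
class Mention:
    text: str      # surface form of the mention
    context: str   # surrounding sentence
    entity: str    # gold entity ID, or NIL


def mask_entities(mentions, kb, mask_ratio, seed=0):
    """Drop a random share of linked entities from the KB and relabel
    mentions of the dropped entities as NIL (assumed procedure)."""
    rng = random.Random(seed)
    # Entities that are both in the KB and linked by at least one mention.
    linked = sorted({m.entity for m in mentions if m.entity in kb})
    n_mask = int(len(linked) * mask_ratio)
    masked = set(rng.sample(linked, n_mask))
    reduced_kb = set(kb) - masked
    relabeled = [
        replace(m, entity=NIL) if m.entity in masked else m
        for m in mentions
    ]
    return relabeled, reduced_kb, masked


if __name__ == "__main__":
    # Toy data with made-up entity IDs.
    mentions = [
        Mention("Apple", "Apple released a new phone.", "E1"),
        Mention("apple", "She ate an apple.", "E2"),
        Mention("apple of my eye", "You are the apple of my eye.", NIL),
    ]
    kb = {"E1", "E2", "E3"}
    relabeled, reduced_kb, masked = mask_entities(mentions, kb, mask_ratio=0.5)
    for m in relabeled:
        print(m.text, "->", m.entity)
```

Under this reading, masked mentions would correspond to the Missing Entity category, since the phrase still refers to a real entity that the knowledge base no longer contains. Non-Entity Phrase mentions, such as the idiom in the toy data, would have to come from annotation instead.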

Code Implementations: 1 repo