CL · May 6, 2025

Evaluation of LLMs on Long-tail Entity Linking in Historical Documents

arXiv:2505.03473v1 · 2 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This paper addresses the understudied challenge of long-tail entity linking for NLP applications in historical domains, but its contribution is incremental, since it evaluates existing LLMs on new data.

The paper tackles the problem of linking long-tail entities in historical documents using LLMs. It finds that GPT and Llama 3 perform encouragingly well compared with a state-of-the-art framework, though no concrete numbers are provided.

Entity Linking (EL) plays a crucial role in Natural Language Processing (NLP) applications, enabling the disambiguation of entity mentions by linking them to their corresponding entries in a reference knowledge base (KB). Thanks to their deep contextual understanding capabilities, LLMs offer a new perspective to tackle EL, promising better results than traditional methods. Despite the impressive generalization capabilities of LLMs, linking less popular, long-tail entities remains challenging as these entities are often underrepresented in training data and knowledge bases. Furthermore, the long-tail EL task is an understudied problem, and limited studies address it with LLMs. In the present work, we assess the performance of two popular LLMs, GPT and Llama 3, in a long-tail entity linking scenario. Using MHERCL v0.1, a manually annotated benchmark of sentences from domain-specific historical texts, we quantitatively compare the performance of LLMs in identifying and linking entities to their corresponding Wikidata entries against that of ReLiK, a state-of-the-art Entity Linking and Relation Extraction framework. Our preliminary experiments reveal that LLMs perform encouragingly well in long-tail EL, indicating that this technology can be a valuable adjunct in filling the gap between head and long-tail EL.
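
The abstract describes a quantitative comparison of predicted Wikidata links against a manually annotated benchmark but does not name the metric. As an illustration only, the sketch below scores predictions with micro-averaged precision, recall, and F1, a common choice for entity linking. The data layout (one set of `(mention, QID)` pairs per sentence), the function name `micro_prf`, and the placeholder identifiers such as `Q_A` are assumptions made for this example; they are not taken from the paper, MHERCL, or ReLiK.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: a link is a (mention surface form, Wikidata QID) pair.
Link = Tuple[str, str]


def micro_prf(gold: List[Set[Link]], pred: List[Set[Link]]) -> Dict[str, float]:
    """Micro-averaged precision, recall, and F1 over per-sentence link sets."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same sentences")

    tp = fp = fn = 0
    for gold_links, pred_links in zip(gold, pred):
        tp += len(gold_links & pred_links)  # correct mention and correct QID
        fp += len(pred_links - gold_links)  # predicted but not in the gold set
        fn += len(gold_links - pred_links)  # in the gold set but not predicted

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical data: the QIDs are placeholders, not real Wikidata entries.
gold = [
    {("Verdi", "Q_A"), ("La Scala", "Q_B")},
    {("Rossini", "Q_C")},
]
pred = [
    {("Verdi", "Q_A"), ("La Scala", "Q_X")},  # second link has the wrong QID
    {("Rossini", "Q_C")},
]

scores = micro_prf(gold, pred)
```

Under this scheme a mention linked to the wrong QID counts as both a false positive and a false negative, so long-tail entities that a system mislinks lower precision and recall together.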

