CL | Feb 13, 2025

Musical Heritage Historical Entity Linking

arXiv:2502.09168v1 | 4 citations | h-index: 6 | Artif Intell Rev
Originality: Incremental advance
AI Analysis

This work addresses entity linking for historical music documents, a domain-specific problem of interest to researchers in digital humanities and musicology.

The authors tackle the challenge of linking named entities in historical music texts to knowledge bases. They introduce the MHERCL benchmark, which contains entities that are under-represented in or absent from well-known knowledge bases, and propose both an unsupervised entity linking model and a method for extending supervised linkers with Knowledge Graphs. Both approaches improve entity linking performance on these documents.

Linking named entities occurring in text to their corresponding entity in a Knowledge Base (KB) is challenging, especially when dealing with historical texts. In this work, we introduce Musical Heritage named Entities Recognition, Classification and Linking (MHERCL), a novel benchmark consisting of manually annotated sentences extrapolated from historical periodicals of the music domain. MHERCL contains named entities under-represented or absent in the most famous KBs. We experiment with several State-of-the-Art models on the Entity Linking (EL) task and show that MHERCL is a challenging dataset for all of them. We propose a novel unsupervised EL model and a method to extend supervised entity linkers by using Knowledge Graphs (KGs) to tackle the main difficulties posed by historical documents. Our experiments reveal that relying on unsupervised techniques and improving models with logical constraints based on KGs and heuristics to predict NIL entities (entities not represented in the KB of reference) results in better EL performance on historical documents.
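
To make the NIL idea in the abstract concrete, here is a minimal sketch of an unsupervised linker with a type constraint and a NIL heuristic. It is not the authors' model: the toy knowledge base, the entity IDs, the `link` function, the string-similarity scoring, and the 0.6 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy knowledge base: entity ID -> (label, type).
# The IDs and the type vocabulary are invented for this sketch.
TOY_KB = {
    "E1": ("Giuseppe Verdi", "person"),
    "E2": ("Teatro alla Scala", "place"),
    "E3": ("Giuseppina Strepponi", "person"),
}

NIL = "NIL"  # label for mentions with no matching KB entry


def similarity(a, b):
    """Case-insensitive surface similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link(mention, mention_type, kb=TOY_KB, nil_threshold=0.6):
    """Return the best-matching entity ID, or NIL if no candidate is good enough.

    Two simple rules stand in for the ideas named in the abstract:
    - a type constraint: candidates whose KB type differs from the
      mention type are discarded (a stand-in for KG-based constraints);
    - a NIL heuristic: if the best score is below nil_threshold,
      the mention is treated as absent from the KB.
    """
    best_id, best_score = NIL, 0.0
    for entity_id, (label, entity_type) in kb.items():
        if entity_type != mention_type:
            continue  # type constraint rules this candidate out
        score = similarity(mention, label)
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id if best_score >= nil_threshold else NIL


if __name__ == "__main__":
    print(link("Giuseppe Verdi", "person"))
    print(link("Giuseppe Verdi", "place"))
    print(link("Zzzz Qqqq", "person"))
```

In this sketch an exact label match of the right type is linked, while a mention with the wrong type or with no similar label falls back to NIL. A real system would replace the surface similarity with contextual scoring and draw its type information from the Knowledge Graph.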

