CL · AI · Apr 30, 2025

Homa at SemEval-2025 Task 5: Aligning Librarian Records with OntoAligner for Subject Tagging

arXiv:2504.21474v1 · 3 citations · h-index: 16
Originality: Synthesis-oriented
AI Analysis

This work addresses subject tagging in digital libraries, but it is incremental as it applies an existing alignment toolkit to a new dataset.

The paper tackles the automatic assignment of subject labels to technical records from TIBKAT using the GND taxonomy. Its experiments show both strengths and limitations of OntoAligner in handling multilingual records for this task.

This paper presents our system, Homa, for SemEval-2025 Task 5: Subject Tagging, which focuses on automatically assigning subject labels to technical records from TIBKAT using the Gemeinsame Normdatei (GND) taxonomy. We leverage OntoAligner, a modular ontology alignment toolkit, to address this task by integrating retrieval-augmented generation (RAG) techniques. Our approach formulates the subject tagging problem as an alignment task, where records are matched to GND categories based on semantic similarity. We evaluate OntoAligner's adaptability for subject indexing and analyze its effectiveness in handling multilingual records. Experimental results demonstrate the strengths and limitations of this method, highlighting the potential of alignment techniques for improving subject tagging in digital libraries.
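To make the "tagging as alignment" framing concrete, here is a minimal sketch of ranking candidate subjects against a record by similarity. It does not use OntoAligner's API and is not the authors' implementation: a bag-of-words cosine score stands in for semantic similarity, the RAG step is omitted, and the subject ids, descriptions, and record text are hypothetical toy data.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count word tokens (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_subjects(record: str, subjects: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the ids of the top_k subjects most similar to the record text."""
    rec_vec = bag_of_words(record)
    scored = [(cosine(rec_vec, bag_of_words(desc)), sid) for sid, desc in subjects.items()]
    # Highest score first; ties are broken alphabetically by subject id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [sid for _, sid in scored[:top_k]]


# Hypothetical toy data; real GND entries and TIBKAT records look different.
SUBJECTS = {
    "subj-ml": "machine learning neural networks",
    "subj-lib": "library catalog subject indexing",
    "subj-chem": "organic chemistry synthesis",
}
RECORD = "Neural networks for subject indexing in a library catalog"
TOP = rank_subjects(RECORD, SUBJECTS, top_k=2)
print(TOP)
```

In a real system the token-count vectors would presumably be replaced by multilingual embeddings, since plain word overlap cannot match a German record to an English subject label.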

