DL · IR · Jan 20, 2021

What is all this new MeSH about? Exploring the semantic provenance of new descriptors in the MeSH thesaurus

arXiv:2101.08293v3 · 8 citations
AI Analysis

This work provides insights into thesaurus evolution for biomedical researchers and curators, but it is incremental as it builds on existing classification methods.

The paper tackled the problem of understanding how new descriptors are introduced in the Medical Subject Headings (MeSH) thesaurus by analyzing their semantic provenance over 15 years, finding that only about 25% represent new concepts while the rest were previously covered by existing descriptors.

The Medical Subject Headings (MeSH) thesaurus is a controlled vocabulary widely used in biomedical knowledge systems, particularly for semantic indexing of scientific literature. As the MeSH hierarchy evolves through annual version updates, some new descriptors are introduced that were not previously available. This paper explores the conceptual provenance of these new descriptors. In particular, we investigate whether such new descriptors were previously covered by older descriptors and how they currently relate to them. To this end, we propose a framework to categorize new descriptors based on their current relation to older descriptors. Based on the proposed classification scheme, we quantify, analyse and present the different types of new descriptors introduced in MeSH during the last fifteen years. The results show that only about 25% of new MeSH descriptors correspond to new emerging concepts, whereas the rest were previously covered by one or more existing descriptors, either implicitly or explicitly. Most of them were covered by a single existing descriptor, and they usually end up as descendants of it in the current hierarchy, gradually leading towards a more fine-grained MeSH vocabulary. These insights about the dynamics of the thesaurus are useful for the retrospective study of scientific articles annotated with MeSH, but could also be used to inform the policy of updating the thesaurus in the future.
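The categorization idea in the abstract can be illustrated with a minimal sketch. This is not the paper's actual scheme: the category labels, the `Descriptor` class, the `categorize` function, and the placeholder identifiers and tree numbers are all assumptions made for illustration. The only MeSH convention relied on is that tree numbers are dot-separated, so a descendant's tree number extends that of its ancestor.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical, simplified view of a MeSH descriptor."""
    ui: str                        # placeholder identifier, not a real MeSH UI
    name: str
    tree_numbers: Tuple[str, ...]  # positions in the current hierarchy


def is_descendant(child: Descriptor, ancestor: Descriptor) -> bool:
    """True if any tree number of `child` extends a tree number of `ancestor`."""
    return any(
        c.startswith(a + ".")
        for c in child.tree_numbers
        for a in ancestor.tree_numbers
    )


def categorize(new: Descriptor, previous_hosts: Sequence[Descriptor]) -> str:
    """Assign an illustrative category label to a new descriptor.

    `previous_hosts` is assumed to hold the older descriptors that covered
    the concept before the new descriptor was introduced (empty if none).
    """
    if not previous_hosts:
        return "emerging concept"
    coverage = "single host" if len(previous_hosts) == 1 else "multiple hosts"
    if all(is_descendant(new, host) for host in previous_hosts):
        relation = "descendant"
    else:
        relation = "other relation"
    return f"{coverage}, {relation}"


# Placeholder data; these are not real MeSH entries.
host = Descriptor("D-OLD", "Older descriptor", ("A01.100",))
other = Descriptor("D-OTHER", "Unrelated older descriptor", ("B02.300",))
new = Descriptor("D-NEW", "New descriptor", ("A01.100.200",))

print(categorize(new, [host]))         # single host, descendant
print(categorize(new, []))             # emerging concept
print(categorize(new, [host, other]))  # multiple hosts, other relation
```

The first call mirrors the most common case reported in the abstract: a new descriptor previously covered by a single existing descriptor that ends up as its descendant in the current hierarchy. How previous coverage is determined in practice is outside this sketch.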

Code Implementations: 1 repo