Mohammad Abdel-Qader

2.2IRSep 30, 2017

Towards Understanding the Evolution of Vocabulary Terms in Knowledge Graphs

Mohammad Abdel-Qader, Ansgar Scherp

Vocabularies are used for modeling data in Knowledge Graphs (KG) like the Linked Open Data Cloud and Wikidata. During their lifetime, the vocabularies of the KGs are subject to changes. New terms are coined, while existing terms are modified or declared as deprecated. We first quantify the amount and frequency of changes in vocabularies. Subsequently, we investigate to which extend and when the changes are adopted in the evolution of the KGs. We conduct our experiments on three large-scale KGs for which time-stamped snapshots are available, namely the Billion Triples Challenge datasets, Dynamic Linked Data Observatory dataset, and Wikidata. Our results show that the change frequency of terms is rather low, but can have high impact when adopted on a large amount of distributed graph data on the web. Furthermore, not all coined terms are used and most of the deprecated terms are still used by data publishers. There are variations in the adoption time of terms coming from different vocabularies ranging from very fast (few days) to very slow (few years). Surprisingly, there are also adoptions we could observe even before the vocabulary changes are published. Understanding this adoption is important, since otherwise it may lead to wrong assumptions about the modeling status of data published on the web and may result in difficulties when querying the data from distributed sources.

Mohammad Abdel-Qader

1 Paper