AI

Aug 28, 2024

Hierarchical Blockmodelling for Knowledge Graphs

Marcin Pietrasik, Marek Reformat, Anna Wilbik

arXiv:2408.15649v12.32 citations, h-index: 4, Has Code

Originality: Incremental advance

AI Analysis

This work provides a foundational step for applying stochastic blockmodels to knowledge graphs, though it is incremental as it adapts existing probabilistic methods to a new domain.

The paper tackles hierarchical entity clustering in knowledge graphs by integrating the Nested Chinese Restaurant Process and the Stick Breaking Process into a stochastic blockmodel. Evaluated on synthetic and real-world datasets, the model induces coherent cluster hierarchies in small-scale settings.

In this paper, we investigate the use of probabilistic graphical models, specifically stochastic blockmodels, for hierarchical entity clustering on knowledge graphs. These models, seldom used in the Semantic Web community, decompose a graph into a set of probability distributions. The parameters of these distributions are then inferred, allowing the distributions to be sampled to generate a random graph. In a non-parametric setting, this allows for the induction of hierarchical clusterings without prior constraints on the hierarchy's structure. Specifically, this is achieved by integrating the Nested Chinese Restaurant Process and the Stick Breaking Process into the generative model. We propose a model that leverages this integration and derive a collapsed Gibbs sampling scheme for its inference. To aid in understanding, we describe the steps in this derivation and provide an implementation of the sampler. We evaluate our model on synthetic and real-world datasets and quantitatively compare it against benchmark models. We further evaluate our results qualitatively and find that our model is capable of inducing coherent cluster hierarchies in small-scale settings. The work presented in this paper provides the first step toward the further application of stochastic blockmodels to knowledge graphs on a larger scale. We conclude the paper with potential avenues for future work on more scalable inference schemes.
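To make the two priors named in the abstract more concrete, the following is a minimal sketch of a truncated Stick Breaking Process and of path sampling under a Nested Chinese Restaurant Process. It is illustrative only and is not the authors' model or sampler: the function names (`stick_breaking`, `crp_choice`, `sample_ncrp_paths`), the fixed tree depth, the truncation, and the parameter values `alpha` and `gamma` are all assumptions made for this example.

```python
import random


def stick_breaking(alpha, num_sticks, rng):
    """Truncated stick-breaking: return num_sticks non-negative weights summing to 1."""
    weights = []
    remaining = 1.0
    for _ in range(num_sticks - 1):
        frac = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # truncation: leftover mass goes to the last stick
    return weights


def crp_choice(counts, gamma, rng):
    """Chinese Restaurant Process step: pick an existing table with probability
    proportional to its count, or a new table (index len(counts)) with
    probability proportional to gamma."""
    r = rng.random() * (sum(counts) + gamma)
    acc = 0.0
    for i, c in enumerate(counts):
        acc += c
        if r < acc:
            return i
    return len(counts)


def sample_ncrp_paths(num_entities, depth, gamma, rng):
    """Assign each entity a root-to-leaf path by running a CRP at every node visited."""
    tree = {}  # path prefix (tuple) -> list of visit counts, one per child
    paths = []
    for _ in range(num_entities):
        prefix = ()
        for _ in range(depth):
            counts = tree.setdefault(prefix, [])
            child = crp_choice(counts, gamma, rng)
            if child == len(counts):  # open a new branch
                counts.append(0)
            counts[child] += 1
            prefix = prefix + (child,)
        paths.append(prefix)
    return paths


rng = random.Random(0)  # hypothetical seed, only for reproducibility
weights = stick_breaking(alpha=1.0, num_sticks=4, rng=rng)
paths = sample_ncrp_paths(num_entities=10, depth=3, gamma=1.0, rng=rng)
```

In this sketch, entities whose paths share a prefix fall into the same cluster at that level of the hierarchy, and the number of branches at each node is not fixed in advance, which is the sense in which the prior places no constraints on the hierarchy's structure. The stick-breaking weights could, for instance, serve as a distribution over tree levels; how the paper combines the two processes is described in the paper itself, not here.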
