LG · Jun 16, 2015

Numeric Input Relations for Relational Learning with Applications to Community Structure Analysis

arXiv:1506.05055v11.1

Originality: Incremental advance

AI Analysis

This work addresses the problem of incorporating numerical variables into relational learning, with applications such as community structure analysis. It represents an incremental advance in hybrid SRL models.

The paper tackles the limited handling of numerical data in statistical relational learning (SRL) by introducing numeric input relations into the Relational Bayesian Network (RBN) framework. This enables probabilistic models for multi-relational networks in which link probabilities depend on numeric latent features, with a generic learning procedure for maximum-likelihood fitting of both model parameters and latent feature values.

Most work in the area of statistical relational learning (SRL) is focussed on discrete data, even though a few approaches for hybrid SRL models have been proposed that combine numerical and discrete variables. In this paper we distinguish numerical random variables for which a probability distribution is defined by the model from numerical input variables that are only used for conditioning the distribution of discrete response variables. We show how numerical input relations can very easily be used in the Relational Bayesian Network framework, and that existing inference and learning methods need only minor adjustments to be applied in this generalized setting. The resulting framework provides natural relational extensions of classical probabilistic models for categorical data. We demonstrate the usefulness of RBN models with numeric input relations by several examples. In particular, we use the augmented RBN framework to define probabilistic models for multi-relational (social) networks in which the probability of a link between two nodes depends on numeric latent feature vectors associated with the nodes. A generic learning procedure can be used to obtain a maximum-likelihood fit of model parameters and latent feature values for a variety of models that can be expressed in the high-level RBN representation. Specifically, we propose a model that allows us to interpret learned latent feature values as community centrality degrees by which we can identify nodes that are central for one community, that are hubs between communities, or that are isolated nodes. In a multi-relational setting, the model also provides a characterization of how different relations are associated with each community.
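To make the latent-feature idea concrete, here is a minimal sketch in plain Python. It is not the paper's RBN formulation or its learning procedure. It assumes a simple logistic link model in which the probability of a link between nodes i and j is sigmoid(bias + u_i · u_j), with nonnegative feature vectors u_i loosely playing the role of community degrees. The function names, the toy graph, the projection onto nonnegative values, and the plain gradient-ascent fit are all illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def link_prob(u, i, j, bias):
    # Assumed model: P(link i-j) = sigmoid(bias + <u_i, u_j>).
    return sigmoid(bias + sum(a * b for a, b in zip(u[i], u[j])))


def log_likelihood(u, bias, n, edges):
    # Bernoulli log-likelihood over all unordered node pairs.
    ll = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            p = min(max(link_prob(u, i, j, bias), 1e-12), 1.0 - 1e-12)
            ll += math.log(p) if (i, j) in edges else math.log(1.0 - p)
    return ll


def init_features(n, k, seed=0):
    # Random positive starting values for the latent features.
    rng = random.Random(seed)
    return [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]


def fit(n, edges, k=2, steps=300, lr=0.05, seed=0):
    # Jointly fit the bias and the latent features by projected gradient ascent.
    u = init_features(n, k, seed)
    bias = 0.0
    for _ in range(steps):
        grad_u = [[0.0] * k for _ in range(n)]
        grad_b = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                y = 1.0 if (i, j) in edges else 0.0
                err = y - link_prob(u, i, j, bias)  # d(ll)/d(logit)
                grad_b += err
                for c in range(k):
                    grad_u[i][c] += err * u[j][c]
                    grad_u[j][c] += err * u[i][c]
        bias += lr * grad_b
        for i in range(n):
            for c in range(k):
                # Projection keeps features nonnegative (an assumption of this sketch).
                u[i][c] = max(0.0, u[i][c] + lr * grad_u[i][c])
    return u, bias


# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the bridge 2-3.
N = 6
EDGES = {(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)}

features, bias = fit(N, EDGES)
for node, vec in enumerate(features):
    print(node, [round(v, 2) for v in vec])
```

In a fit like this, a node with one dominant feature would be read as central to a single community, a node with several sizable features as a hub between communities, and a node with all features near zero as isolated. The paper's model and its interpretation of the learned values may differ in detail from this simplified version.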
