Lovro Šubelj

SI
h-index: 18
6 papers
95 citations
Novelty: 21%
AI Score: 29

6 Papers

LG, May 31, 2025
RelDiff: Relational Data Generative Modeling with Graph-Based Diffusion Models

Valter Hudovernik, Minkai Xu, Juntong Shi et al.

Real-world databases are predominantly relational, comprising multiple interlinked tables that contain complex structural and statistical dependencies. Learning generative models on relational data has shown great promise in generating synthetic data and imputing missing values. However, existing methods often struggle to capture this complexity, typically reducing relational data to conditionally generated flat tables and imposing limiting structural assumptions. To address these limitations, we introduce RelDiff, a novel diffusion generative model that synthesizes complete relational databases by explicitly modeling their foreign key graph structure. RelDiff combines a joint graph-conditioned diffusion process across all tables for attribute synthesis, and a $2K+$SBM graph generator based on the Stochastic Block Model for structure generation. The decomposition of graph structure and relational attributes ensures both high fidelity and referential integrity, both of which are crucial aspects of synthetic relational database generation. Experiments on 11 benchmark datasets demonstrate that RelDiff consistently outperforms prior methods in producing realistic and coherent synthetic relational databases. Code is available at https://github.com/ValterH/RelDiff.
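As a rough illustration of the Stochastic Block Model that the structure generator builds on, the sketch below samples a plain SBM graph. It is not the $2K+$SBM generator from the paper or from the RelDiff repository: the function name `sample_sbm`, the block sizes, and the probability matrix are all hypothetical, chosen only to show the idea that the probability of an edge depends solely on the blocks of its two endpoints.

```python
import random

def sample_sbm(block_sizes, edge_prob, seed=0):
    """Sample an undirected graph from a plain Stochastic Block Model.

    block_sizes: number of nodes in each block (hypothetical toy values).
    edge_prob:   edge_prob[a][b] is the probability of an edge between a
                 node in block a and a node in block b.
    """
    rng = random.Random(seed)
    # Assign each node index to its block, e.g. [0, 0, 0, 1, 1, 1].
    blocks = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    edges = []
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            # The edge probability depends only on the two block labels.
            if rng.random() < edge_prob[blocks[i]][blocks[j]]:
                edges.append((i, j))
    return blocks, edges

# Extreme toy setting: always link within a block, never across blocks.
blocks, edges = sample_sbm([3, 3], [[1.0, 0.0], [0.0, 1.0]])
```

With less extreme probabilities the same function yields noisy block structure; in a relational setting the nodes would presumably correspond to rows and the edges to foreign key references, but that mapping is an assumption here.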

SI, Aug 1, 2020
Learning-based link prediction analysis for Facebook100 network

Tim Poštuvan, Semir Salkić, Lovro Šubelj

In social network science, Facebook is one of the most interesting and widely used social networks and media platforms. Its data have contributed significantly to the evolution of social network research and link prediction techniques, which are important tools in link mining and analysis. This paper gives the first comprehensive analysis of link prediction on the Facebook100 network. We study performance and evaluate multiple machine learning algorithms on different feature sets. To derive features, we use network embeddings and topology-based techniques such as node2vec and vectors of similarity metrics. In addition, we employ node-based features, which are available for the Facebook100 network but rarely found in other datasets. The adopted approaches are discussed and the results are presented. Lastly, we compare and review the applied models in terms of overall performance and classification rates.
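A vector of similarity metrics for a candidate link can be sketched as follows. This is a minimal example under stated assumptions: the toy graph, the function name `similarity_features`, and the particular choice of four metrics (common neighbours, Jaccard, Adamic-Adar, preferential attachment) are illustrative and are not taken from the paper.

```python
import math

# Toy undirected graph (hypothetical data, not Facebook100).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def similarity_features(adj, u, v):
    """Return topology-based similarity scores for the node pair (u, v)."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    jaccard = len(common) / len(union) if union else 0.0
    # Adamic-Adar: common neighbours weighted by inverse log-degree.
    # Degree-1 neighbours are skipped to avoid dividing by log(1) = 0.
    adamic_adar = sum(1.0 / math.log(len(adj[w])) for w in common if len(adj[w]) > 1)
    pref_attachment = len(adj[u]) * len(adj[v])
    return [len(common), jaccard, adamic_adar, pref_attachment]

# Feature vector for the currently unlinked pair (a, d).
features_ad = similarity_features(adj, "a", "d")
```

Such vectors, computed for linked and unlinked pairs, could then serve as input rows for any standard classifier.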

SI, Jun 22, 2019
Predicting kills in Game of Thrones using network properties

Jaka Stavanja, Matej Klemen, Lovro Šubelj

TV series such as HBO's Game of Thrones have attracted a large number of dedicated followers, mostly due to the dramatic murders of the most important characters. In our work, we try to predict killer and victim pairs using data about previous kills and additional metadata. We construct a network where two character nodes are linked if one killed the other and use a link prediction framework to evaluate different techniques for kill prediction. Lastly, we compute various network properties on a social network of characters and use them as features in conjunction with classic data mining techniques. Due to the small size of the dataset and the somewhat random kill distribution, we cannot predict much with standard indices alone, although using them in conjunction with additional rules based on degrees works surprisingly well. The features we compute on the social network help the classic machine learning approaches, but do not yield very accurate predictions. The best results overall are achieved using indices based on simple degree information, the best of which gives an Area Under the ROC Curve (AUC) of 0.875.
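The evaluation described above, scoring candidate pairs with a degree-based index and summarising the ranking by AUC, can be sketched as below. Everything here is an assumption made for illustration: the character names are placeholders, the index (the killer's number of previous kills) is only one possible degree-based rule, and the data are invented rather than drawn from the paper's dataset.

```python
def auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical observed kills as (killer, victim) pairs.
train = [("k1", "v1"), ("k1", "v2"), ("k1", "v3"), ("k2", "v4")]
out_degree = {}
for killer, _ in train:
    out_degree[killer] = out_degree.get(killer, 0) + 1

def score(killer, victim):
    # Assumed index: rank a pair by how many kills the killer already has.
    return out_degree.get(killer, 0)

held_out_kills = [("k1", "v4"), ("k2", "v1")]            # positive pairs
non_kills = [("v1", "v2"), ("k2", "v2"), ("v3", "k1")]   # negative pairs

pos = [score(u, v) for u, v in held_out_kills]
neg = [score(u, v) for u, v in non_kills]
result = auc(pos, neg)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of kills from non-kills.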

IR, Jul 26, 2018
General Context-Aware Data Matching and Merging Framework

Slavko Žitnik, Lovro Šubelj, Dejan Lavbič et al.

Due to numerous public information sources and services, many methods to combine heterogeneous data have been proposed recently. However, general end-to-end solutions are still rare, especially systems that take different context dimensions into account. Therefore, the techniques often prove insufficient or are limited to a certain domain. In this paper we briefly review and rigorously evaluate a general framework for data matching and merging. The framework employs collective entity resolution and redundancy elimination using three dimensions of context types. In order to achieve domain-independent results, data is enriched with semantics and trust. However, the main contribution of the paper is an evaluation on five public domain-incompatible datasets. Furthermore, we introduce additional attribute, relationship, semantic and trust metrics, which allow complete framework management. Besides improving the overall results within the framework, these metrics could be of independent interest.

SI, Feb 17, 2015
Node mixing and group structure of complex software networks

Lovro Šubelj, Slavko Žitnik, Neli Blagus et al.

Large software projects are among the most sophisticated human-made systems, consisting of a network of interdependent parts. Past studies of software systems from the perspective of complex networks have already led to notable discoveries with different applications. Nevertheless, our comprehension of the structure of software networks remains only partial. We here investigate correlations, or mixing, between linked nodes and show that software networks reveal dichotomous node degree mixing similar to that recently observed in biological networks. We further show that software networks also reveal characteristic clustering profiles and mixing. Hence, node mixing in software networks significantly differs from that in, e.g., the Internet or social networks. We explain the observed mixing through the presence of groups of nodes with a common linking pattern. More precisely, besides densely linked groups known as communities, software networks also consist of disconnected groups denoted modules, core/periphery structures and others. Moreover, groups coincide with the intrinsic properties of the underlying software projects, which promotes practical applications in software engineering.
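Degree mixing between linked nodes is commonly quantified by a degree assortativity coefficient, the Pearson correlation of the degrees at the two ends of each edge. The sketch below computes it for an undirected toy graph. It is an assumption that this standard undirected coefficient is a fair stand-in: the paper may use directed variants based on in- and out-degrees, and the function name and star graph are illustrative only.

```python
import math

def degree_assortativity(edges):
    """Pearson correlation of the degrees found at the two ends of each edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) when all nodes have the same degree.
    return cov / math.sqrt(var_x * var_y)

# A star graph is maximally disassortative: the hub links only to leaves.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
r_star = degree_assortativity(star)
```

Positive values indicate that high-degree nodes tend to link to other high-degree nodes (assortative mixing), while negative values indicate the opposite.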

SI, Aug 13, 2012
Software systems through complex networks science: Review, analysis and applications

Lovro Šubelj, Marko Bajec

Complex software systems are among the most sophisticated human-made systems, yet little is known about the actual structure of 'good' software. We here study different software systems developed in Java from the perspective of network science. The study reveals that network theory can provide a prominent set of techniques for the exploratory analysis of large complex software systems. We further identify several applications in software engineering, and propose different network-based quality indicators that address software design, efficiency, reusability, vulnerability, controllability and others. We also highlight various interesting findings. For example, software systems are highly vulnerable to processes like bug propagation, yet they are not easily controllable.