SI · Feb 2
Twinning Complex Networked Systems: Data-Driven Calibration of the mABCD Synthetic Graph Generator
Piotr Bródka, Michał Czuba, Bogumił Kamiński et al.
The increasing availability of relational data has contributed to a growing reliance on network-based representations of complex systems. Over time, these models have evolved to capture more nuanced properties, such as the heterogeneity of relationships, leading to the concept of multilayer networks. However, the analysis and evaluation of methods for these structures are often hindered by the limited availability of large-scale empirical data. As a result, graph generators are commonly used as a workaround, albeit at the cost of introducing systematic biases. In this paper, we address the inverse-generator problem by inferring the configuration parameters of a multilayer network generator, mABCD, from a real-world system. Our goal is to identify parameter settings that enable the generator to produce synthetic networks that act as digital twins of the original structure. We propose a method for estimating matching configurations and for quantifying the associated error. Our results demonstrate that this task is non-trivial: strong interdependencies between configuration parameters undermine independent estimation of each parameter and instead favour a joint-prediction approach.
SI · Dec 2, 2024
Identifying Key Nodes for the Influence Spread using a Machine Learning Approach
Mateusz Stolarski, Adam Piróg, Piotr Bródka
The identification of key nodes in complex networks is an important topic in many areas of network science. It is vital to a variety of real-world applications, including viral marketing, epidemic spreading and influence maximization. In recent years, machine learning algorithms have been shown to outperform conventional, centrality-based methods in accuracy and consistency, but this approach still requires further refinement. What information about the influencers can be extracted from the network? How can we precisely obtain the labels required for training? Can these models generalize well? In this paper, we answer these questions by presenting an enhanced machine learning-based framework for the influence spread problem. We focus on identifying key nodes for the Independent Cascade model, which is a popular reference method. Our main contribution is an improved process of obtaining the labels required for training by introducing 'Smart Bins' and proving their advantage over known methods. Next, we show that our methodology allows ML models not only to predict the influence of a given node, but also to determine other characteristics of the spreading process, which is another novelty in the relevant literature. Finally, we extensively test our framework and its ability to generalize beyond complex networks of different types and sizes, gaining important insight into the properties of these methods.
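To make the Independent Cascade setting concrete, the following is a minimal sketch of how a single node's influence could be estimated by repeated simulation. It is not the authors' pipeline and does not implement 'Smart Bins'; the function names `independent_cascade` and `estimate_influence`, the adjacency-dictionary input, and the single uniform activation probability `p` are all assumptions made for illustration.

```python
import random

def independent_cascade(adjacency, seeds, p, rng):
    """Run one Independent Cascade simulation.

    adjacency: dict mapping node -> iterable of out-neighbours (assumed format).
    p: uniform activation probability per edge (an illustrative assumption).
    Returns a list of lists: the nodes newly activated at each step,
    with the seeds as step 0.
    """
    active = set(seeds)
    frontier = list(seeds)
    history = [list(seeds)]
    while frontier:
        newly_activated = []
        for u in frontier:
            for v in adjacency.get(u, ()):
                # Each newly active node gets one chance per inactive neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_activated.append(v)
        if newly_activated:
            history.append(newly_activated)
        frontier = newly_activated
    return history

def estimate_influence(adjacency, node, p=0.1, runs=1000, seed=0):
    """Average number of activated nodes (seed included) over repeated runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        history = independent_cascade(adjacency, [node], p, rng)
        total += sum(len(step) for step in history)
    return total / runs

# Tiny directed path 0 -> 1 -> 2, used only as a sanity check.
path = {0: [1], 1: [2], 2: []}
full_spread = estimate_influence(path, 0, p=1.0, runs=10)
no_spread = estimate_influence(path, 0, p=0.0, runs=10)
```

With `p=1.0` every edge fires, so the whole path is reached; with `p=0.0` only the seed is active. Averaged values of this kind could serve as raw influence scores before any labelling or binning step is applied.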
1.7 · CE · Apr 14
The Elusive Nature of Roughness: Linking Hydraulics and Graph Theory for Water Distribution Networks Model Calibration
Karol Dykiert, Mateusz Stolarski, Michał Czuba et al.
Accurate pipe roughness estimation in large-scale water distribution networks is often hindered by the high cost of traditional field methods. This study investigates whether network partitioning based on hydraulic and graph-derived attributes can enhance the calibration of these parameters. Using a high-fidelity model of a real network as a benchmark, we evaluate density-based clustering and topology-driven grouping strategies. Optimization experiments demonstrate that attribute-based grouping yields stable, repeatable results comparable to manual calibration for hydraulically significant pipes. While hydraulic attributes generate more distinct cluster structures, the inclusion of graph-based data improves calibration robustness by stabilizing the optimization process. Notably, density-based clustering achieves similar accuracy to k-means while reducing computational effort in specific configurations. Although the method does not eliminate all sources of uncertainty, results suggest that topology-informed grouping provides a systematic, reproducible, and computationally efficient alternative to manual heuristics, highlighting the critical role of network structure in reliable parameter estimation.
SI · May 27, 2025
Identifying Super Spreaders in Multilayer Networks
Michał Czuba, Mateusz Stolarski, Adam Piróg et al.
Identifying super-spreaders can be framed as a subtask of the influence maximisation problem. It seeks to pinpoint agents within a network that, if selected as single diffusion seeds, disseminate information most effectively. Multilayer networks, a specific class of heterogeneous graphs, can capture diverse types of interactions (e.g., physical-virtual or professional-social), and thus offer a more accurate representation of complex relational structures. In this work, we introduce a novel approach to identifying super-spreaders in such networks by leveraging graph neural networks. To this end, we construct a dataset by simulating information diffusion across hundreds of networks; to the best of our knowledge, it is the first of its kind tailored specifically to multilayer networks. We further formulate the task as a variation of the ranking prediction problem based on a four-dimensional vector that quantifies each agent's spreading potential: (i) the number of activations; (ii) the duration of the diffusion process; (iii) the peak number of activations; and (iv) the simulation step at which this peak occurs. Our model, TopSpreadersNetwork, comprises a relationship-agnostic encoder and a custom aggregation layer. This design enables generalisation to previously unseen data and adapts to varying graph sizes. In an extensive evaluation, we compare our model against classic centrality-based heuristics and competitive deep learning methods. The results, obtained across a broad spectrum of real-world and synthetic multilayer networks, demonstrate that TopSpreadersNetwork achieves superior performance in identifying high-impact nodes, while also offering improved interpretability through its structured output.
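The four-dimensional spreading-potential vector described above can be illustrated with a short sketch that derives the four quantities from one simulated diffusion trace. This is only one plausible reading of the abstract: the function name `spreading_vector`, the input format (a list of new-activation counts per step), the choice to count the seed step, the zero-based step index, and the tie-breaking rule for the peak are all assumptions, not details confirmed by the paper.

```python
from typing import List, Tuple

def spreading_vector(new_activations: List[int]) -> Tuple[int, int, int, int]:
    """Summarise one diffusion trace as a four-component vector.

    new_activations[t] is the number of agents newly activated at
    simulation step t (assumed input format; step 0 holds the seed).

    Returns (total activations, duration in steps, peak activations,
    step at which the peak occurs).
    """
    if not new_activations:
        raise ValueError("the trace must contain at least one step")
    total = sum(new_activations)             # (i) number of activations
    duration = len(new_activations)          # (ii) duration of the process
    peak = max(new_activations)              # (iii) peak number of activations
    peak_step = new_activations.index(peak)  # (iv) first step reaching the peak
    return total, duration, peak, peak_step

# Hypothetical trace: one seed, then 3, 5 and 2 new activations.
example = spreading_vector([1, 3, 5, 2])
```

For the hypothetical trace above, the vector is `(11, 4, 5, 2)`. Vectors of this kind, computed per seed agent, are the sort of targets a ranking model could be trained to predict.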