SILG Nov 24, 2025

Large Scale Community-Aware Network Generation

arXiv:2511.19717v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the generation of synthetic networks with ground-truth community labels for researchers in network analysis. It is incremental, since it builds on the existing RECCS algorithm.

The paper tackles the challenge of evaluating community detection algorithms by making the RECCS synthetic network generator more scalable. The resulting RECCS+ and RECCS++ achieve speedups of up to 49x and 139x respectively, and RECCS++ scales to networks with over 100 million nodes and nearly 2 billion edges.

Community detection, or network clustering, is used to identify latent community structure in networks. Because labeled ground truth is scarce in real-world networks, evaluating these algorithms is difficult. To address this, researchers use synthetic network generators that produce networks with ground-truth community labels. RECCS is one such algorithm: it takes a network and its clustering as input and generates a synthetic network through a modular pipeline. Each generated ground-truth cluster preserves key characteristics of the corresponding input cluster, including connectivity, minimum degree, and degree sequence distribution. The output consists of a synthetically generated network and disjoint ground-truth cluster labels for all nodes. In this paper, we present two enhanced versions: RECCS+ and RECCS++. RECCS+ maintains algorithmic fidelity to the original RECCS while introducing parallelization through an orchestrator that coordinates algorithmic components across multiple processes and employs multithreading. RECCS++ builds upon this foundation with additional algorithmic optimizations to achieve further speedup. Our experimental results demonstrate that RECCS+ and RECCS++ achieve speedups of up to 49x and 139x respectively on our benchmark datasets, with the additional performance gains of RECCS++ coming at the cost of a modest accuracy tradeoff. With this performance, RECCS++ can scale to networks with over 100 million nodes and nearly 2 billion edges.
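To make the per-cluster characteristics named in the abstract concrete, the following minimal sketch computes them for a small labeled network. It is not the RECCS, RECCS+, or RECCS++ implementation. The function name `cluster_stats`, the edge-list and label-dictionary input format, and the use of plain connectedness as a stand-in for "connectivity" are all assumptions made for illustration; the paper may use a stronger notion of connectivity.

```python
from collections import defaultdict, deque


def cluster_stats(edges, labels):
    """Summarize each cluster using intra-cluster edges only.

    edges  : iterable of (u, v) pairs (hypothetical input format)
    labels : dict mapping every node to exactly one cluster id
    """
    members = defaultdict(set)
    for node, cid in labels.items():
        members[cid].add(node)

    # Adjacency restricted to edges whose endpoints share a cluster.
    adj = {node: set() for node in labels}
    for u, v in edges:
        if u != v and labels[u] == labels[v]:
            adj[u].add(v)
            adj[v].add(u)

    stats = {}
    for cid, nodes in members.items():
        degrees = sorted((len(adj[n]) for n in nodes), reverse=True)

        # Breadth-first search from an arbitrary member; the cluster is
        # connected if the search reaches every member.
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nb in adj[cur]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)

        stats[cid] = {
            "size": len(nodes),
            "degree_sequence": degrees,
            "min_degree": degrees[-1],
            # Assumption: connectedness used as a simple proxy for connectivity.
            "connected": len(seen) == len(nodes),
        }
    return stats


# Toy input: a triangle (cluster "a") joined by one edge to a pair (cluster "b").
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
stats = cluster_stats(edges, labels)
```

A generator in the spirit of the abstract would treat summaries like these as targets that each synthetic cluster should match. Since each cluster is summarized independently, this kind of per-cluster work is also a natural candidate for the process-level parallelism the abstract attributes to RECCS+.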

