46.6DCJun 2
SIGMA: A Versatile Streaming Graph Partitioner for Vertex- and Edge-Balanced Distributed GNN TrainingBarbara Hoffmann, Shai Dorian Peretz, Adil Chhabra et al.
Distributed Graph Neural Network (GNN) training depends critically on how the underlying graph is partitioned across compute resources. Existing graph partitioners focus either on vertex partitioning or edge partitioning and typically optimize only a single communication objective (edge cut or vertex cut) under a single balance constraint (vertex balance or edge balance). We present SIGMA (Streaming Integrated Graph Partitioning with Multi-objective Awareness), a versatile streaming graph partitioner that supports both vertex and edge partitioning within a unified multi-objective, multi-constraint framework. Depending on the target distributed GNN system, SIGMA can be configured for edgecut-oriented vertex partitioning or vertex-cut-oriented edge partitioning while simultaneously accounting for both vertex and edge balancing. A clustering-based preprocessing stage incorporates global graph structure to improve partition quality while preserving the efficiency and scalability advantages of streaming partitioning. We evaluate SIGMA on six benchmark graphs spanning diverse domains and scales using two distributed GNN training systems: Dist-GNN (edge-partitioned) and DistDGL (vertex-partitioned). Across both settings, SIGMA consistently achieves strong performance, showing its ability to navigate complex trade-offs between partition quality, training efficiency, and memory consumption, frequently outperforming streaming baselines while remaining competitive with high-quality in-memory partitioners such as METIS, KaHIP and HEP. These results demonstrate that a unified streaming partitioner can effectively address the communication, compute, and memory challenges of distributed GNN training across fundamentally different system architectures.
SIMay 11, 2022
Local Motif Clustering via (Hyper)Graph PartitioningAdil Chhabra, Marcelo Fonseca Faraj, Christian Schulz
A widely-used operation on graphs is local clustering, i.e., extracting a well-characterized community around a seed node without the need to process the whole graph. Recently local motif clustering has been proposed: it looks for a local cluster based on the distribution of motifs. Since this local clustering perspective is relatively new, most approaches proposed for it are extensions of statistical and numerical methods previously used for edge-based local clustering, while the available combinatorial approaches are still few and relatively simple. In this work, we build a hypergraph and a graph model which both represent the motif-distribution around the seed node. We solve these models using sophisticated combinatorial algorithms designed for (hyper)graph partitioning. In extensive experiments with the triangle motif, we observe that our algorithm computes communities with a motif conductance value being one third on average in comparison against the communities computed by the state-of-the-art tool MAPPR while being 6.3 times faster on average.
LGFeb 8, 2025
CluStRE: Streaming Graph Clustering with Multi-Stage RefinementAdil Chhabra, Shai Dorian Peretz, Christian Schulz
We present CluStRE, a novel streaming graph clustering algorithm that balances computational efficiency with high-quality clustering using multi-stage refinement. Unlike traditional in-memory clustering approaches, CluStRE processes graphs in a streaming setting, significantly reducing memory overhead while leveraging re-streaming and evolutionary heuristics to improve solution quality. Our method dynamically constructs a quotient graph, enabling modularity-based optimization while efficiently handling large-scale graphs. We introduce multiple configurations of CluStRE to provide trade-offs between speed, memory consumption, and clustering quality. Experimental evaluations demonstrate that CluStRE improves solution quality by 89.8%, operates 2.6 times faster, and uses less than two-thirds of the memory required by the state-of-the-art streaming clustering algorithm on average. Moreover, our strongest mode enhances solution quality by up to 150% on average. With this, CluStRE achieves comparable solution quality to in-memory algorithms, i.e. over 96% of the quality of clustering approaches, including Louvain, effectively bridging the gap between streaming and traditional clustering methods.