LG SI · Apr 4, 2022

Synthetic Graph Generation to Benchmark Graph Learning

Anton Tsitsulin, Benedek Rozemberczki, John Palowitch, Bryan Perozzi

arXiv:2204.01376v1 · 16.934 citations · h-index: 36 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a bottleneck for graph learning researchers by enabling more thorough evaluation of algorithms. The advance is incremental, since it responds to an already recognized need for better benchmarks.

The paper tackles the shortage of datasets for benchmarking graph learning algorithms. It proposes a synthetic graph generator for studying algorithm behavior in controlled scenarios, and a case study shows how the generator yields insight into unsupervised and supervised graph neural network models.

Graph learning algorithms have attained state-of-the-art performance on many graph analysis tasks such as node classification, link prediction, and clustering. It has, however, become hard to track the field's burgeoning progress. One reason is the very small number of datasets used in practice to benchmark the performance of graph learning algorithms. This shockingly small sample size (~10) allows for only limited scientific insight into the problem. In this work, we aim to address this deficiency. We propose to generate synthetic graphs and study the behaviour of graph learning algorithms in a controlled scenario. We develop a fully-featured synthetic graph generator that allows deep inspection of different models. We argue that synthetic graph generation allows for thorough investigation of algorithms and provides more insight than overfitting on three citation datasets. In a case study, we show how our framework provides insight into unsupervised and supervised graph neural network models.
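To make the idea of a controlled scenario concrete, here is a minimal sketch of a synthetic graph generator. The abstract does not say which generative model the authors use, so this example assumes a plain stochastic block model with Gaussian node features. The function names (`generate_sbm_graph`, `intra_cluster_edge_fraction`) and all parameters are hypothetical, chosen only for illustration, and are not the paper's API.

```python
import random


def generate_sbm_graph(num_nodes, num_clusters, p_in, p_out,
                       feature_dim=2, center_scale=3.0, seed=0):
    """Sample an undirected graph with cluster labels and node features.

    Assumption: a plain stochastic block model, used here only as an
    illustration; the paper's actual generator may differ.
    """
    rng = random.Random(seed)
    # Round-robin assignment keeps the clusters balanced.
    labels = [i % num_clusters for i in range(num_nodes)]

    # Each node pair is linked with probability p_in if both nodes share
    # a cluster, and with probability p_out otherwise.
    edges = []
    for u in range(num_nodes):
        for v in range(u + 1, num_nodes):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.append((u, v))

    # One random center per cluster; a node's features are its cluster
    # center plus unit-variance Gaussian noise.
    centers = [[rng.gauss(0.0, center_scale) for _ in range(feature_dim)]
               for _ in range(num_clusters)]
    features = [[c + rng.gauss(0.0, 1.0) for c in centers[labels[i]]]
                for i in range(num_nodes)]
    return edges, labels, features


def intra_cluster_edge_fraction(edges, labels):
    """Fraction of edges whose endpoints share a cluster label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


if __name__ == "__main__":
    # Sweep the between-cluster edge probability to make the cluster
    # structure progressively harder to recover.
    for p_out in (0.01, 0.05, 0.2):
        edges, labels, _ = generate_sbm_graph(200, 4, p_in=0.2, p_out=p_out)
        frac = intra_cluster_edge_fraction(edges, labels)
        print(f"p_out={p_out}: {len(edges)} edges, intra fraction={frac:.3f}")
```

Sweeping a single parameter such as `p_out` while holding the rest fixed is one way to probe how a model degrades as cluster structure weakens, which a small fixed set of real datasets cannot offer.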
