LG SI

Jun 8, 2023

Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs

Zehui Li, Xiangyu Zhao, Mingzhu Shen, Guy-Bart Stan, Pietro Liò, Yiren Zhao

arXiv:2306.05108v23.81 citations

h-index: 37

Originality: Incremental advance

AI Analysis

This work addresses the need for standardized datasets and benchmarks for researchers and practitioners working on graph neural networks for complex, real-world networks.

The paper addresses the lack of unified modeling and evaluation for complex graphs with higher-order relations. It introduces hybrid graphs as a unified definition and presents the Hybrid Graph Benchmark (HGB), which comprises 23 real-world datasets and an evaluation framework. An empirical study of existing GNNs on HGB reveals research gaps in their performance on these graphs.

Graphs are widely used to encapsulate a variety of data formats, but real-world networks often involve complex node relations beyond only being pairwise. While hypergraphs and hierarchical graphs have been developed and employed to account for the complex node relations, they cannot fully represent these complexities in practice. Additionally, though many Graph Neural Networks (GNNs) have been proposed for representation learning on higher-order graphs, they are usually only evaluated on simple graph datasets. Therefore, there is a need for a unified modelling of higher-order graphs, and a collection of comprehensive datasets with an accessible evaluation framework to fully understand the performance of these algorithms on complex graphs. In this paper, we introduce the concept of hybrid graphs, a unified definition for higher-order graphs, and present the Hybrid Graph Benchmark (HGB). HGB contains 23 real-world hybrid graph datasets across various domains such as biology, social media, and e-commerce. Furthermore, we provide an extensible evaluation framework and a supporting codebase to facilitate the training and evaluation of GNNs on HGB. Our empirical study of existing GNNs on HGB reveals various research opportunities and gaps, including (1) evaluating the actual performance improvement of hypergraph GNNs over simple graph GNNs; (2) comparing the impact of different sampling strategies on hybrid graph learning methods; and (3) exploring ways to integrate simple graph and hypergraph information. We make our source code and full datasets publicly available at https://zehui127.github.io/hybrid-graph-benchmark/.
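The abstract does not spell out the formal definition of a hybrid graph, so the following is only a minimal sketch of one plausible reading: a single node set that carries both pairwise edges and hyperedges. The class name `HybridGraph`, its fields, and its methods are hypothetical and are not taken from the HGB codebase.

```python
from dataclasses import dataclass, field


@dataclass
class HybridGraph:
    """Illustrative container only (hypothetical, not the HGB API).

    Assumption: a hybrid graph keeps simple pairwise edges and
    higher-order hyperedges over the same set of nodes.
    """

    num_nodes: int
    edges: list[tuple[int, int]] = field(default_factory=list)
    hyperedges: list[frozenset[int]] = field(default_factory=list)

    def pairwise_neighbors(self, node: int) -> set[int]:
        """Nodes connected to `node` by an undirected simple edge."""
        out: set[int] = set()
        for u, v in self.edges:
            if u == node:
                out.add(v)
            elif v == node:
                out.add(u)
        return out

    def hyperedge_neighbors(self, node: int) -> set[int]:
        """Nodes that share at least one hyperedge with `node`."""
        out: set[int] = set()
        for hyperedge in self.hyperedges:
            if node in hyperedge:
                out |= hyperedge
        out.discard(node)
        return out


# Toy example: two pairwise edges plus two hyperedges on five nodes.
g = HybridGraph(
    num_nodes=5,
    edges=[(0, 1), (1, 2)],
    hyperedges=[frozenset({0, 2, 3}), frozenset({3, 4})],
)
```

Keeping the two relation types in separate fields lets a model consume either view alone or both together, which is one way to read the abstract's point about integrating simple graph and hypergraph information.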
