LG, Oct 27, 2021

Towards a Taxonomy of Graph Learning Datasets

Renming Liu, Semih Cantürk, Frederik Wenkel, Dylan Sandfelder, Devin Kreuzer, Anna Little, Sarah McGuire, Leslie O'Bray, Michael Perlmutter, Bastian Rieck, Matthew Hirn, Guy Wolf

arXiv:2110.14809v1

Originality: Synthesis-oriented

AI Analysis

This work addresses a foundational problem for the graph learning community by providing a taxonomy to improve benchmarking and model development, though it is incremental as it builds on existing dataset analysis.

The authors address the lack of systematic understanding of the datasets used in graph neural network (GNN) benchmarking. They taxonomize graph datasets by applying designed perturbations and observing which data characteristics GNN models rely on, which yields a clearer picture of the dataset properties that matter for model evaluation and for developing specialized GNNs.

Graph neural networks (GNNs) have attracted much attention due to their ability to leverage the intrinsic geometries of the underlying data. Although many different types of GNN models have been developed, with many benchmarking procedures to demonstrate the superiority of one GNN model over the others, there is a lack of systematic understanding of the underlying benchmarking datasets, and what aspects of the model are being tested. Here, we provide a principled approach to taxonomize graph benchmarking datasets by carefully designing a collection of graph perturbations to probe the essential data characteristics that GNN models leverage to perform predictions. Our data-driven taxonomization of graph datasets provides a new understanding of critical dataset characteristics that will enable better model evaluation and the development of more specialized GNN models.
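To make the perturbation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes two illustrative perturbations (discarding all edges, and replacing all node features with a constant) and a toy stand-in "model" that averages each node's feature with its neighbours' features. All function names, the toy graph, and the model are hypothetical and chosen only for illustration. The change in accuracy under each perturbation gives a small "sensitivity profile" for the dataset.

```python
# Illustrative sketch only: the perturbations, the toy graph, and the stand-in
# "model" below are assumptions, not taken from the paper.

def drop_all_edges(features, edges):
    """Perturbation: keep node features, discard the graph structure."""
    return features, []

def constant_features(features, edges):
    """Perturbation: keep the graph structure, set every node feature to 1.0."""
    return [1.0] * len(features), edges

def mean_aggregate_predict(features, edges):
    """Stand-in model: average a node's feature with its neighbours' features
    and predict class 1 if the mean exceeds 0.5, else class 0."""
    n = len(features)
    neighbours = [[] for _ in range(n)]
    for u, v in edges:  # undirected edges
        neighbours[u].append(v)
        neighbours[v].append(u)
    preds = []
    for i in range(n):
        vals = [features[i]] + [features[j] for j in neighbours[i]]
        preds.append(1 if sum(vals) / len(vals) > 0.5 else 0)
    return preds

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def sensitivity_profile(features, edges, labels, perturbations):
    """Return baseline accuracy and the accuracy change under each perturbation."""
    base = accuracy(mean_aggregate_predict(features, edges), labels)
    profile = {}
    for name, perturb in perturbations.items():
        f2, e2 = perturb(features, edges)
        profile[name] = accuracy(mean_aggregate_predict(f2, e2), labels) - base
    return base, profile

# Toy graph: two triangles, one per class, with one noisy feature in each.
features = [0.1, 0.9, 0.2, 0.8, 0.3, 0.9]
labels = [0, 0, 0, 1, 1, 1]
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]

perturbations = {
    "drop_all_edges": drop_all_edges,
    "constant_features": constant_features,
}
base, profile = sensitivity_profile(features, edges, labels, perturbations)
print(base, profile)
```

On this toy graph, accuracy drops under both perturbations, which suggests that the prediction task depends on structure and on node features together. A dataset whose accuracy is unchanged when edges are removed would, under this sketch, fall into a different group than one that degrades sharply.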
