LG | Jun 18, 2022

Beyond Real-world Benchmark Datasets: An Empirical Study of Node Classification with GNNs

arXiv:2206.09144v6 | 32 citations | h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses researchers' need for fine-grained analysis of GNNs. It is an incremental advance: it builds on existing evaluation methods by introducing controlled synthetic graphs.

The paper tackles the limited evaluation of Graph Neural Networks (GNNs) for node classification. It runs extensive experiments with a synthetic graph generator to analyze GNN performance across four graph characteristics (class size distributions, edge connection proportions, attribute values, and graph sizes) and to clarify the strengths and weaknesses of GNNs.

Graph Neural Networks (GNNs) have achieved great success on node classification tasks. Despite the broad interest in developing and evaluating GNNs, they have been assessed on only a limited set of benchmark datasets. As a result, the existing evaluation of GNNs lacks fine-grained analysis across the various characteristics of graphs. Motivated by this, we conduct extensive experiments with a synthetic graph generator that can generate graphs with controlled characteristics for fine-grained analysis. Our empirical studies clarify the strengths and weaknesses of GNNs with respect to four major characteristics of real-world graphs with class labels of nodes, i.e., 1) class size distributions (balanced vs. imbalanced), 2) edge connection proportions between classes (homophilic vs. heterophilic), 3) attribute values (biased vs. random), and 4) graph sizes (small vs. large). In addition, to foster future research on GNNs, we publicly release our codebase, which allows users to evaluate various GNNs on various graphs. We hope this work offers interesting insights for future research.
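To make the four controlled characteristics concrete, the following is a minimal sketch of how such a graph could be generated. It is not the paper's generator or released codebase. It assumes a simple stochastic-block-model-style construction, and the function names (`generate_graph`, `edge_homophily`) and parameters (`p_in`, `p_out`, `attr_bias`) are hypothetical names chosen for illustration.

```python
import numpy as np

def generate_graph(class_sizes, p_in, p_out, attr_bias, n_attrs=4, seed=0):
    """Illustrative generator (an assumption, not the paper's method).

    class_sizes: nodes per class (balanced vs. imbalanced); their sum is the graph size.
    p_in, p_out: edge probability within / between classes (homophilic vs. heterophilic).
    attr_bias:   strength of the class signal in attributes (0.0 gives random attributes).
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(class_sizes)), class_sizes)
    n = labels.size

    # Pairwise edge probability depends only on whether two nodes share a class.
    same_class = labels[:, None] == labels[None, :]
    prob = np.where(same_class, p_in, p_out)

    # Sample the strict upper triangle, then mirror it: undirected, no self-loops.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    adj = upper | upper.T

    # Attributes are a class-specific center scaled by attr_bias, plus unit Gaussian noise.
    centers = rng.normal(size=(len(class_sizes), n_attrs))
    attrs = attr_bias * centers[labels] + rng.normal(size=(n, n_attrs))
    return adj, attrs, labels

def edge_homophily(adj, labels):
    """Fraction of edges whose two endpoints share a class label."""
    src, dst = np.nonzero(np.triu(adj, k=1))
    return float(np.mean(labels[src] == labels[dst]))
```

Under these assumptions, `generate_graph([60, 30, 10], p_in=0.2, p_out=0.02, attr_bias=2.0)` would give a small, imbalanced, homophilic graph with class-biased attributes. Swapping `p_in` and `p_out` and setting `attr_bias=0.0` would give a heterophilic graph with random attributes.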

Code Implementations: 1 repo
