LG, SP | Apr 29, 2023

Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks

Feng Ji, See Hian Lee, Hanyang Meng, Kai Zhao, Jielong Yang, Wee Peng Tay

arXiv:2305.00139v1 | 13.718 citations | h-index: 33 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses node classification accuracy in GNNs, offering an incremental improvement for graph-based machine learning tasks.

The paper addresses node classification in graph neural networks by introducing label non-uniformity, a measure derived from the Wasserstein distance between a node's softmax distribution and the uniform distribution. The measure is used to identify hard-to-classify nodes, and model performance is then improved through targeted selection of training samples or by dropping edges.

In node classification using graph neural networks (GNNs), a typical model generates logits for different class labels at each node. A softmax layer often outputs a label prediction based on the largest logit. We demonstrate that it is possible to infer hidden graph structural information from the dataset using these logits. We introduce the key notion of label non-uniformity, which is derived from the Wasserstein distance between the softmax distribution of the logits and the uniform distribution. We demonstrate that nodes with small label non-uniformity are harder to classify correctly. We theoretically analyze how the label non-uniformity varies across the graph, which provides insights into boosting the model performance: increasing training samples with high non-uniformity or dropping edges to reduce the maximal cut size of the node set of small non-uniformity. These mechanisms can be easily added to a base GNN model. Experimental results demonstrate that our approach improves the performance of many benchmark base models.
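
As a rough illustration of the label non-uniformity described above, the sketch below scores each node by the Wasserstein distance between its softmax distribution and the uniform distribution. It assumes a 0-1 ground metric on the label set (any two distinct labels are at distance 1), in which case the 1-Wasserstein distance reduces to the total variation distance; the abstract does not state the ground metric, so this choice is an assumption. The function names `softmax`, `label_non_uniformity`, and `rank_by_non_uniformity` and the sample logits are hypothetical and are not taken from the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_non_uniformity(logits):
    """Distance between softmax(logits) and the uniform distribution.

    Assumption: 0-1 ground metric on labels, so the 1-Wasserstein
    distance equals the total variation distance,
    0.5 * sum_i |p_i - 1/c|.
    """
    probs = softmax(logits)
    c = len(probs)
    return 0.5 * sum(abs(p - 1.0 / c) for p in probs)


def rank_by_non_uniformity(node_logits):
    """Return node indices sorted from lowest to highest non-uniformity."""
    scores = [label_non_uniformity(row) for row in node_logits]
    return sorted(range(len(scores)), key=lambda i: scores[i])


# Hypothetical logits for three nodes and three classes.
node_logits = [
    [4.0, 0.0, 0.0],  # confident prediction: high non-uniformity
    [0.1, 0.0, 0.1],  # nearly flat: low non-uniformity
    [1.0, 1.0, 1.0],  # exactly uniform: zero non-uniformity
]

if __name__ == "__main__":
    for i, row in enumerate(node_logits):
        print(i, round(label_non_uniformity(row), 4))
    print("lowest to highest:", rank_by_non_uniformity(node_logits))
```

Under this assumption the score ranges from 0 (uniform softmax output) to 1 - 1/c (all mass on one class), so the nodes at the front of the ranking are the ones the abstract describes as harder to classify correctly.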

