LG | AI | Sep 28, 2025

Pure Node Selection for Imbalanced Graph Node Classification

arXiv:2509.23662v1 | h-index: 13 | Has Code | Neural Networks
Originality: Incremental advance
AI Analysis

The paper addresses seed-dependent performance instability in graph neural networks for imbalanced node classification, an incremental advance over existing specialized imbalance-handling methods.

The paper tackles the Randomness Anomalous Connectivity Problem (RACP) in imbalanced graph node classification, in which certain off-the-shelf models degrade significantly under unfavorable random seeds. It proposes Pure Node Sampling (PNS), a plug-and-play module applied during node synthesis that removes this seed sensitivity and outperforms the baseline across benchmark datasets and GNN backbones.

Class imbalance refers to an uneven distribution of samples across the classes of a dataset, where some classes are significantly underrepresented compared to others. It is also prevalent in graph-structured data. Graph neural networks (GNNs) typically assume balanced classes and often overlook class imbalance. In our investigation, we identified a problem, which we term the Randomness Anomalous Connectivity Problem (RACP), in which certain off-the-shelf models are affected by random seeds, leading to significant performance degradation. To eliminate the influence of random factors in these algorithms, we propose PNS (Pure Node Sampling), which addresses RACP in the node synthesis stage. Unlike existing approaches that design specialized algorithms to handle either quantity imbalance or topological imbalance, PNS is a novel plug-and-play module that operates directly during node synthesis to mitigate RACP. Moreover, PNS also alleviates the performance degradation caused by abnormal distributions of node neighbors. We conduct a series of experiments to identify which factors are influenced by random seeds. Experimental results demonstrate the effectiveness and stability of our method, which not only eliminates the effect of unfavorable random seeds but also outperforms the baseline across various benchmark datasets with different GNN backbones. Data and code are available at https://github.com/flzeng1/PNS.
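To make the two quantities discussed in the abstract concrete, the following minimal sketch computes a class imbalance ratio and summarizes how a score varies across random seeds. It is not the authors' PNS method or code. The helper names `imbalance_ratio`, `seed_sweep`, and `toy_run` are hypothetical, and `toy_run` is a stand-in that returns a synthetic score rather than training a GNN.

```python
import random
import statistics
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def seed_sweep(run_fn, seeds):
    """Call run_fn once per seed and summarize the returned scores."""
    scores = {seed: run_fn(seed) for seed in seeds}
    values = list(scores.values())
    return {
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
        "worst_seed": min(scores, key=scores.get),
        "worst_score": min(values),
    }


def toy_run(seed):
    """Hypothetical placeholder for train-and-evaluate; synthetic score in [0.6, 0.8]."""
    rng = random.Random(seed)
    return 0.6 + 0.2 * rng.random()


# Toy label vector with three classes of sizes 90, 9, and 1.
labels = [0] * 90 + [1] * 9 + [2] * 1
print(imbalance_ratio(labels))  # 90.0

summary = seed_sweep(toy_run, seeds=range(10))
print(summary)
```

In a real experiment, `toy_run` would be replaced by a function that seeds the framework, trains the model, and returns a test metric. A large gap between `mean` and `worst_score` is the kind of seed sensitivity the abstract describes.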

