LG · Sep 9, 2024

Graffin: Stand for Tails in Imbalanced Node Classification

arXiv:2409.05339v1 · 2.6 · h-index: 9

Originality: Incremental advance

AI Analysis

This work addresses the challenge of imbalanced data distributions in graph representation learning for real-world applications, though it is an incremental improvement over existing methods.

The authors tackled the problem of imbalanced node classification in graphs by proposing Graffin, a pluggable tail data augmentation module that improves adaptation to tail data without significantly degrading overall model performance, as validated on four real-world datasets.

Graph representation learning (GRL) models have succeeded in many scenarios. Real-world graphs, however, often follow imbalanced distributions, for example in node labels and degrees, which poses a critical challenge for GRL: imbalanced inputs can lead to imbalanced outputs. Most existing works ignore this and assume that the distribution of input graphs is balanced, which does not match real situations and results in worse model performance on tail data. The dominance of head data leaves tail data underrepresented when training graph neural networks (GNNs). We therefore propose Graffin, a pluggable tail data augmentation module, to address these issues. Inspired by recurrent neural networks (RNNs), Graffin flows head features into tail data through graph serialization techniques to alleviate the imbalance of tail representation. Local and global structures are fused to form the node representation, combining neighborhood and sequence information to enrich the semantics of tail data. We validate the performance of Graffin on four real-world datasets in node classification tasks. Results show that Graffin can improve the adaptation to tail data without significantly degrading the overall model performance.
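The general idea of serializing a graph and letting a recurrence carry head information toward the tail can be illustrated with a toy sketch. This is not the authors' implementation: the degree-based ordering, the fixed exponential-moving-average recurrence (standing in for a learned RNN), the mean neighbor aggregation, and the concatenation used for fusion are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
# Toy sketch only: every design choice here is an assumption, not Graffin itself.

def serialize_by_degree(adj):
    """Order node indices from highest to lowest degree (head first, tail last)."""
    degrees = [sum(row) for row in adj]
    # sorted() is stable, so nodes with equal degree keep their index order.
    return sorted(range(len(adj)), key=lambda i: -degrees[i])

def recurrent_scan(features, order, decay=0.5):
    """Sequence view: h_t = decay * h_{t-1} + (1 - decay) * x_t along the order.

    A fixed moving average stands in for a learned RNN cell; state built
    from head nodes is carried forward into the tail positions.
    """
    hidden = [0.0] * len(features[0])
    out = [None] * len(features)
    for node in order:
        hidden = [decay * h + (1.0 - decay) * x
                  for h, x in zip(hidden, features[node])]
        out[node] = hidden
    return out

def mean_neighbor_aggregate(adj, features):
    """Local view: average a node's own features with its neighbors'.

    Assumes adj has no self-loops (zero diagonal).
    """
    n, dim = len(adj), len(features[0])
    out = []
    for i in range(n):
        members = [i] + [j for j in range(n) if adj[i][j]]
        out.append([sum(features[j][d] for j in members) / len(members)
                    for d in range(dim)])
    return out

def fuse(adj, features, decay=0.5):
    """Concatenate the local (neighborhood) and sequence representations."""
    local = mean_neighbor_aggregate(adj, features)
    seq = recurrent_scan(features, serialize_by_degree(adj), decay)
    return [l + s for l, s in zip(local, seq)]

# Star graph: node 0 is the high-degree head, nodes 1-3 are low-degree tails.
adj = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
features = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
fused = fuse(adj, features)
```

In this toy example the last tail node ends up with a nonzero component from the head node's feature in its sequence half, even though the recurrence visits it last. A learned recurrent cell and a task-specific ordering would replace the fixed choices made here.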
