LG AIApr 19, 2024

Multi-View Subgraph Neural Networks: Self-Supervised Learning with Scarce Labeled Data

Zhenzhong Wang, Qingyuan Zeng, Wanyu Lin, Min Jiang, Kay Chen Tan

arXiv:2404.12569v19.211 citationsh-index: 10IEEE Trans Neural Netw Learn Syst

Originality Highly original

AI Analysis

This work addresses the challenge of low-data regimes in graph-based node classification for real-world applications, offering a novel approach to enhance representation learning.

The paper tackles the problem of graph neural networks (GNNs) overfitting due to scarce labeled data by proposing a self-supervised learning framework called Muse, which uses multi-view subgraphs to capture long-range dependencies and local structure, resulting in improved performance on node classification tasks with limited labeled data.

While graph neural networks (GNNs) have become the de-facto standard for graph-based node classification, they impose a strong assumption on the availability of sufficient labeled samples. This assumption restricts the classification performance of prevailing GNNs on many real-world applications suffering from low-data regimes. Specifically, features extracted from scarce labeled nodes could not provide sufficient supervision for the unlabeled samples, leading to severe over-fitting. In this work, we point out that leveraging subgraphs to capture long-range dependencies can augment the representation of a node with homophily properties, thus alleviating the low-data regime. However, prior works leveraging subgraphs fail to capture the long-range dependencies among nodes. To this end, we present a novel self-supervised learning framework, called multi-view subgraph neural networks (Muse), for handling long-range dependencies. In particular, we propose an information theory-based identification mechanism to identify two types of subgraphs from the views of input space and latent space, respectively. The former is to capture the local structure of the graph, while the latter captures the long-range dependencies among nodes. By fusing these two views of subgraphs, the learned representations can preserve the topological properties of the graph at large, including the local structure and long-range dependencies, thus maximizing their expressiveness for downstream node classification tasks. Experimental results show that Muse outperforms the alternative methods on node classification tasks with limited labeled data.

View on arXiv PDF

Similar