LG AI | Feb 14, 2024

Scalable Graph Self-Supervised Learning

Ali Saheb Pasand, Reza Moravej, Mahdi Biparva, Raika Karimi, Ali Ghodsi

arXiv:2402.09603v1 | 2.6 | h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses scalability challenges for researchers and practitioners working with large real-world graphs, though the advance is incremental because it builds on existing regularization-based SSL methods.

The paper tackles the computational scalability of non-contrastive graph self-supervised learning by sampling nodes or embedding dimensions to reduce the cost of computing the loss. The reported experiments show that this does not lower downstream performance and, in most cases, improves it.

In regularization-based Self-Supervised Learning (SSL) methods for graphs, computational complexity increases with the number of nodes in the graph and with the embedding dimension. To address the limited scalability of non-contrastive graph SSL, we propose a novel approach that reduces the cost of computing the covariance matrix for a pre-training loss function with volume-maximization terms. Our work focuses on reducing the cost of the loss computation via graph node or dimension sampling. We provide theoretical insight into why dimension sampling results in accurate loss computations and support it with a mathematical derivation of the approach. We build our experimental setup on node-level graph prediction tasks, where SSL pre-training has been shown to be difficult due to the large size of real-world graphs. Our experiments demonstrate that the cost of the loss computation can be reduced via node or dimension sampling without lowering downstream performance; in most cases, sampling improves it. Ablation studies and experimental analysis are provided to untangle the roles of the different factors in the experimental setup.
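The sampling idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's loss: the off-diagonal covariance penalty below is a generic stand-in for a covariance-based regularization term, and the function names, the uniform sampling scheme, and the `num_nodes` / `num_dims` parameters are assumptions made only for illustration.

```python
import numpy as np


def covariance_penalty(z):
    """Sum of squared off-diagonal covariance entries, scaled by 1/d.

    z: (n, d) array of node embeddings. Building the (d, d) covariance
    matrix costs O(n * d^2), which is the term sampling aims to shrink.
    """
    n, d = z.shape
    zc = z - z.mean(axis=0, keepdims=True)   # center each dimension
    cov = zc.T @ zc / (n - 1)                # (d, d) sample covariance
    off_diag = cov - np.diag(np.diag(cov))   # zero out the diagonal
    return float((off_diag ** 2).sum() / d)


def sampled_covariance_penalty(z, rng, num_nodes=None, num_dims=None):
    """Same penalty on a uniform random subset of rows and/or columns.

    Hypothetical helper: uniform sampling without replacement is an
    assumption, not necessarily the scheme used in the paper.
    """
    n, d = z.shape
    if num_nodes is not None:                # node sampling: keep some rows
        rows = rng.choice(n, size=num_nodes, replace=False)
        z = z[rows]
    if num_dims is not None:                 # dimension sampling: keep some columns
        cols = rng.choice(d, size=num_dims, replace=False)
        z = z[:, cols]
    return covariance_penalty(z)


rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 64))              # synthetic embeddings, not real data
full = covariance_penalty(z)
approx = sampled_covariance_penalty(z, rng, num_nodes=200, num_dims=16)
```

With `num_nodes` rows and `num_dims` columns kept, the covariance step in this sketch costs O(num_nodes * num_dims^2) instead of O(n * d^2). How closely the sampled value tracks the full loss is the question the paper's theoretical analysis addresses.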
