LG | AI | Feb 14, 2024

Scalable Graph Self-Supervised Learning

arXiv:2402.09603v1 | h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses scalability challenges for researchers and practitioners working with large real-world graphs. It is an incremental advance, since it builds on existing regularization-based SSL methods.

The paper tackles the computational scalability issue in non-contrastive graph self-supervised learning by proposing node or dimension sampling to reduce the cost of loss computation. It reports that this can be done without lowering downstream performance, and that sampling mostly improves it.

In regularization-based Self-Supervised Learning (SSL) methods for graphs, computational complexity increases with the number of nodes in the graph and with the embedding dimension. To mitigate the scalability problem of non-contrastive graph SSL, we propose a novel approach to reduce the cost of computing the covariance matrix for the pre-training loss function with volume-maximization terms. Our work focuses on reducing the cost associated with the loss computation via graph node or dimension sampling. We provide theoretical insight into why dimension sampling would result in accurate loss computations and support it with a mathematical derivation of the novel approach. We develop our experimental setup on node-level graph prediction tasks, where SSL pre-training has been shown to be difficult due to the large size of real-world graphs. Our experiments demonstrate that the cost associated with the loss computation can be reduced via node or dimension sampling without lowering the downstream performance. Our results demonstrate that sampling mostly results in improved downstream performance. Ablation studies and experimental analysis are provided to untangle the role of the different factors in the experimental setup.
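To make the idea of node or dimension sampling concrete, here is a minimal sketch in Python with NumPy. It is not the paper's loss: the abstract does not give the exact formula, so the sketch assumes a VICReg-style off-diagonal covariance penalty as a stand-in for the covariance-based term. The function names (`covariance_penalty`, `node_sampled_penalty`, `dim_sampled_penalty`) and the random embeddings are hypothetical and chosen only for illustration.

```python
import numpy as np


def covariance_penalty(z):
    """Mean squared off-diagonal entry of the embedding covariance matrix.

    Assumption: a VICReg-style decorrelation term, used here only as a
    stand-in for the covariance-based loss described in the abstract.
    z has shape (n_nodes, n_dims). Cost is O(n_nodes * n_dims**2).
    """
    n, d = z.shape
    zc = z - z.mean(axis=0, keepdims=True)      # center each dimension
    cov = zc.T @ zc / (n - 1)                   # (d, d) sample covariance
    off_diag = cov - np.diag(np.diag(cov))      # zero out the diagonal
    return float((off_diag ** 2).sum() / (d * (d - 1)))


def node_sampled_penalty(z, num_nodes, rng):
    """Same penalty computed on a uniform random subset of nodes (rows)."""
    idx = rng.choice(z.shape[0], size=num_nodes, replace=False)
    return covariance_penalty(z[idx])


def dim_sampled_penalty(z, num_dims, rng):
    """Same penalty computed on a uniform random subset of dimensions (columns)."""
    idx = rng.choice(z.shape[1], size=num_dims, replace=False)
    return covariance_penalty(z[:, idx])


# Hypothetical embeddings: 2000 nodes, 64 dimensions.
rng = np.random.default_rng(0)
z = rng.normal(size=(2000, 64))

full = covariance_penalty(z)                 # uses all nodes and dimensions
by_node = node_sampled_penalty(z, 500, rng)  # covariance from 500 nodes
by_dim = dim_sampled_penalty(z, 16, rng)     # 16 x 16 covariance instead of 64 x 64
```

In this sketch, node sampling shrinks the number of rows entering the covariance product, while dimension sampling shrinks the covariance matrix itself, so the cost drops roughly from n·d² to m·d² or n·k² for m sampled nodes or k sampled dimensions. Because the penalty is averaged over dimension pairs, the dimension-sampled value estimates the same quantity as the full one; the node-sampled value is generally a noisier estimate.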

