LG SI | Aug 23, 2021

Jointly Learnable Data Augmentations for Self-Supervised GNNs

Zekarias T. Kefato, Sarunas Girdzijauskas, Hannes Stärk

arXiv:2108.10420v15.54 citations | Has Code

Originality Highly original

AI Analysis

This work addresses the need for more efficient and effective self-supervised learning on graphs, offering a novel approach to data augmentation that reduces memory and runtime costs.

The authors address the reliance on heuristic data augmentations in self-supervised graph neural networks by proposing GraphSurgeon, a method that learns its augmentations jointly with the embeddings and can apply them in the embedding space (post augmentation). On 14 datasets, it performs comparably to six state-of-the-art semi-supervised and five self-supervised baselines in node classification.

Self-supervised learning (SSL) aims at learning representations of objects without relying on manual labeling. Recently, a number of SSL methods for graph representation learning have achieved performance comparable to SOTA semi-supervised GNNs. A Siamese network, which relies on data augmentation, is the popular architecture used in these methods. However, these methods rely on heuristically crafted data augmentation techniques. Furthermore, they use either contrastive terms or other tricks (e.g., asymmetry) to avoid trivial solutions that can occur in Siamese networks. In this study, we propose GraphSurgeon, a novel SSL method for GNNs with the following features. First, instead of heuristics, we propose a learnable data augmentation method that is jointly learned with the embeddings by leveraging the inherent signal encoded in the graph. In addition, we take advantage of the flexibility of the learnable data augmentation and introduce a new strategy that augments in the embedding space, called post augmentation. This strategy has a significantly lower memory overhead and run-time cost. Second, as it is difficult to sample truly contrastive terms, we avoid explicit negative sampling. Third, instead of relying on engineering tricks, we use a scalable constrained optimization objective motivated by Laplacian Eigenmaps to avoid trivial solutions. To validate the practical use of GraphSurgeon, we perform an empirical evaluation using 14 public datasets across a number of domains, ranging from small- to large-scale graphs with hundreds of millions of edges. Our findings show that GraphSurgeon is comparable to six SOTA semi-supervised baselines and on par with five SOTA self-supervised baselines in node classification tasks. The source code is available at https://github.com/zekarias-tilahun/graph-surgeon.
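
The ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a toy, pure-Python illustration under stated assumptions. The function names (`encode`, `post_augment`, `ssl_loss`), the mean-aggregation encoder, the linear augmentation heads, and the penalty form of the constraint are all hypothetical choices made for this example. The sketch shows two learnable heads producing two views in the embedding space, and a loss that pulls the views together while a soft orthonormality constraint (in the spirit of Laplacian Eigenmaps) penalizes the collapsed all-zero solution, with no negative samples involved.

```python
import random

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def encode(adj, feats, weight):
    """Toy GNN layer (assumed): average each node with its neighbors, then project."""
    n, k = len(adj), len(feats[0])
    agg = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or i == j]
        agg.append([sum(feats[j][c] for j in nbrs) / len(nbrs) for c in range(k)])
    return matmul(agg, weight)

def post_augment(hidden, head):
    """Post augmentation (assumed linear): a learnable head applied to embeddings."""
    return matmul(hidden, head)

def ssl_loss(z1, z2, penalty=1.0):
    """Invariance between views plus a soft constraint Z^T Z / n = I."""
    n = len(z1)
    # Invariance term: mean squared distance between the two views of each node.
    invariance = sum((a - b) ** 2
                     for r1, r2 in zip(z1, z2) for a, b in zip(r1, r2)) / n

    def ortho(z):
        # Squared deviation of the scaled Gram matrix from the identity.
        gram = matmul([list(col) for col in zip(*z)], z)
        d = len(gram)
        return sum((gram[i][j] / n - (1.0 if i == j else 0.0)) ** 2
                   for i in range(d) for j in range(d))

    return invariance + penalty * (ortho(z1) + ortho(z2))

def rand_matrix(rows, cols):
    """Random weights standing in for parameters that training would learn."""
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]        # 3-node path graph
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 2 features per node
hidden = encode(adj, feats, rand_matrix(2, 2)) # the encoder runs once
z1 = post_augment(hidden, rand_matrix(2, 2))   # view 1
z2 = post_augment(hidden, rand_matrix(2, 2))   # view 2
print(ssl_loss(z1, z2))
```

In this sketch the encoder is evaluated once and only the cheap heads are applied twice, which is one plausible reading of why augmenting in the embedding space would cost less memory and run time than augmenting the input and encoding each view separately. In a real setting the encoder and both heads would be trained jointly by gradient descent on the loss.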
