LG, AI | Jun 23, 2022

Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training

arXiv:2206.11959v1 | h-index: 84
Originality: Incremental advance
AI Analysis

The paper addresses a key bottleneck in graph neural network pre-training for domains such as molecular generation, though the advance is incremental because it builds on existing contrastive learning frameworks.

The paper tackles positive instance sampling in graph contrastive pre-training, where existing methods often produce information-deficient or illegal graph instances. It proposes a similarity-aware approach that selects positives from the training set, preserving both legality and similarity to the target graph. The resulting pre-trained GNN models outperform from-scratch training and existing methods on 13 benchmark datasets.

Graph instance contrastive learning has proven to be an effective task for Graph Neural Network (GNN) pre-training. However, one key issue may seriously impede the representational power of existing works: positive instances created by current methods often miss crucial graph information or are even illegal (such as non-chemically-aware graphs in molecular generation). To remedy this issue, we propose to select positive graph instances directly from existing graphs in the training set, which maintains both legality and similarity to the target graphs. Our selection is based on domain-specific pairwise similarity measurements as well as sampling from a hierarchical graph that encodes similarity relations among graphs. In addition, we develop an adaptive node-level pre-training method that dynamically masks nodes so that they are distributed evenly in the graph. We conduct extensive experiments on 13 graph classification and node classification benchmark datasets from various domains. The results demonstrate that GNN models pre-trained with our strategies can outperform both models trained from scratch and variants obtained by existing methods.
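
The core idea of choosing positives from the training set by pairwise similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the Tanimoto (Jaccard) measure, the feature-set representation of each graph, and the names `tanimoto`, `select_positives`, and `train_features` are all assumptions made for illustration, and the hierarchical-graph sampling step from the abstract is not modelled here.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets.

    Assumption: stands in for the paper's unspecified
    domain-specific pairwise similarity measurement.
    """
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_positives(anchor: int, features: List[Set[int]], k: int = 1) -> List[int]:
    """Return indices of the k training graphs most similar to the anchor.

    Positives are real graphs from the training set, so they are
    legal by construction (no augmentation is applied).
    """
    scored = [
        (tanimoto(features[anchor], feats), idx)
        for idx, feats in enumerate(features)
        if idx != anchor  # a graph is never its own positive
    ]
    # Sort by descending similarity; ties are broken by lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Hypothetical toy training set: each graph is summarised by a set of
# substructure ids (for molecules, this could be fingerprint bits).
train_features = [{1, 2, 3}, {1, 2, 4}, {7, 8}, {1, 2, 3, 5}]
positives = select_positives(0, train_features, k=2)
print(positives)
```

In this toy example, graph 3 shares three of four features with the anchor and graph 1 shares two of four, so they are picked ahead of the unrelated graph 2. In a contrastive objective, the selected graphs would play the role that augmented views play in augmentation-based methods.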

