LG Dec 23, 2020

Motif-Driven Contrastive Learning of Graph Representations

arXiv:2012.12533v3 (37 citations)
AI Analysis

This work provides an incremental improvement in graph representation learning for researchers working on self-supervised pre-training of Graph Neural Networks, particularly for tasks requiring global structural understanding.

This paper addresses a limitation of node-level contrastive learning, its inability to capture global graph structure, by proposing a motif-driven approach to subgraph sampling. The framework, MICRO-Graph, extracts frequently occurring subgraph patterns (motifs) and uses them to sample informative subgraphs for contrastive learning. Pre-trained on ogbg-molhiv, it yields an average ROC-AUC improvement of 2.04% across downstream tasks.

Pre-training Graph Neural Networks (GNN) via self-supervised contrastive learning has recently drawn lots of attention. However, most existing works focus on node-level contrastive learning, which cannot capture global graph structure. The key challenge to conducting subgraph-level contrastive learning is to sample informative subgraphs that are semantically meaningful. To solve it, we propose to learn graph motifs, which are frequently-occurring subgraph patterns (e.g. functional groups of molecules), for better subgraph sampling. Our framework MotIf-driven Contrastive leaRning Of Graph representations (MICRO-Graph) can: 1) use GNNs to extract motifs from large graph datasets; 2) leverage learned motifs to sample informative subgraphs for contrastive learning of GNN. We formulate motif learning as a differentiable clustering problem, and adopt EM-clustering to group similar and significant subgraphs into several motifs. Guided by these learned motifs, a sampler is trained to generate more informative subgraphs, and these subgraphs are used to train GNNs through graph-to-subgraph contrastive learning. By pre-training on the ogbg-molhiv dataset with MICRO-Graph, the pre-trained GNN achieves 2.04% ROC-AUC average performance enhancement on various downstream benchmark datasets, which is significantly higher than other state-of-the-art self-supervised learning baselines.
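The two ideas named in the abstract, soft clustering of subgraph embeddings into motifs and graph-to-subgraph contrastive learning, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the embeddings are random stand-ins for GNN outputs, the EM-style updates are a generic soft k-means variant, the loss is a standard InfoNCE form, and every function name, temperature, and shape below is an assumption chosen for illustration. The learned sampler is omitted entirely.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def soft_motif_assignment(sub_emb, motif_emb, temperature=0.1):
    # E-step-like: softmax over cosine similarity to each motif prototype.
    # (Hypothetical helper; the paper's exact clustering objective is not shown here.)
    logits = l2_normalize(sub_emb) @ l2_normalize(motif_emb).T / temperature  # (S, K)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def update_motifs(sub_emb, assign):
    # M-step-like: each prototype becomes the assignment-weighted mean embedding.
    weights = assign / (assign.sum(axis=0, keepdims=True) + 1e-12)  # (S, K)
    return weights.T @ sub_emb  # (K, D)

def graph_to_subgraph_infonce(graph_emb, sub_emb, sub_to_graph, temperature=0.2):
    # InfoNCE: each subgraph should score its own graph above the other
    # graphs in the batch, which act as negatives.
    sim = l2_normalize(sub_emb) @ l2_normalize(graph_emb).T / temperature  # (S, G)
    sim -= sim.max(axis=1, keepdims=True)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(sub_to_graph)), sub_to_graph].mean()

# Toy data: 4 graphs, 2 subgraphs per graph, 16-dim embeddings, 3 motifs.
rng = np.random.default_rng(0)
graph_emb = rng.normal(size=(4, 16))
sub_to_graph = np.array([0, 0, 1, 1, 2, 2, 3, 3])
sub_emb = graph_emb[sub_to_graph] + 0.1 * rng.normal(size=(8, 16))
motif_emb = rng.normal(size=(3, 16))

# A few alternating soft-assignment / prototype-update rounds.
for _ in range(5):
    assign = soft_motif_assignment(sub_emb, motif_emb)
    motif_emb = update_motifs(sub_emb, assign)

loss = graph_to_subgraph_infonce(graph_emb, sub_emb, sub_to_graph)
print("motif assignment shape:", assign.shape)
print("contrastive loss:", float(loss))
```

In a real pipeline the embeddings would come from a GNN encoder and both steps would be trained by gradient descent in an autodiff framework; the sketch only shows the shape of the computations.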

