LG | Aug 21, 2022

MentorGNN: Deriving Curriculum for Pre-Training GNNs

Dawei Zhou, Lecheng Zheng, Dongqi Fu, Jiawei Han, Jingrui He

arXiv:2208.09905v116.526 citations | h-index: 24 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the bias and poor generalization that hand-tuned hyperparameters introduce into graph neural network (GNN) pre-training, offering an incremental improvement over hand-engineered approaches.

The paper proposes MentorGNN, an end-to-end model that replaces hand-tuned hyperparameters in graph pre-training with an automatically derived curriculum for re-weighting graph signals (e.g., nodes, edges, subgraphs). The authors report good generalization on downstream tasks in experiments on real graphs.

Graph pre-training strategies have been attracting a surge of attention in the graph mining community, due to their flexibility in parameterizing graph neural networks (GNNs) without any label information. The key idea lies in encoding valuable information into the backbone GNNs, by predicting the masked graph signals extracted from the input graphs. In order to balance the importance of diverse graph signals (e.g., nodes, edges, subgraphs), the existing approaches are mostly hand-engineered by introducing hyperparameters to re-weight the importance of graph signals. However, human interventions with sub-optimal hyperparameters often inject additional bias and deteriorate the generalization performance in the downstream applications. This paper addresses these limitations from a new perspective, i.e., deriving curriculum for pre-training GNNs. We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs with diverse structures and disparate feature spaces. To comprehend heterogeneous graph signals at different granularities, we propose a curriculum learning paradigm that automatically re-weights graph signals in order to ensure good generalization in the target domain. Moreover, we shed new light on the problem of domain adaptation on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs. Extensive experiments on a wealth of real graphs validate the performance of MentorGNN.
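
The re-weighting idea in the abstract can be illustrated with a minimal sketch. This is not MentorGNN's algorithm: the signal names, loss values, and scores below are hypothetical placeholders, and the scores are fixed by hand here, whereas the paper describes deriving the curriculum automatically. The sketch only shows how per-signal pre-training losses might be combined under normalized weights instead of hand-set coefficients.

```python
import math


def softmax(scores):
    """Turn unnormalized scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_loss(signal_losses, scores):
    """Combine per-signal losses using weights = softmax(scores).

    Both arguments are dicts keyed by signal name. Returns the weighted
    total loss and the weight assigned to each signal.
    """
    names = sorted(signal_losses)
    weights = softmax([scores[name] for name in names])
    total = sum(w * signal_losses[name] for w, name in zip(weights, names))
    return total, dict(zip(names, weights))


# Hypothetical per-signal pre-training losses (illustrative values only).
losses = {"node": 0.9, "edge": 0.4, "subgraph": 1.3}

# Equal scores reproduce a plain average of the losses.
uniform_scores = {"node": 0.0, "edge": 0.0, "subgraph": 0.0}

# Placeholder scores standing in for what a learned curriculum might output.
curriculum_scores = {"node": 0.5, "edge": 2.0, "subgraph": -1.0}

uniform_total, uniform_weights = reweighted_loss(losses, uniform_scores)
curriculum_total, curriculum_weights = reweighted_loss(losses, curriculum_scores)
```

Normalizing with a softmax keeps every weight positive and makes the weights sum to one, so changing one score shifts emphasis between signals without rescaling the overall loss. That is a design choice of this sketch, not something stated in the abstract.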
