GraphProp: Training the Graph Foundation Models using Graph Properties
This work addresses cross-domain generalization in graph machine learning, which matters for applications such as graph classification. The contribution appears incremental, however, because it builds on existing graph foundation model (GFM) approaches and differs mainly in its focus on structural properties.
The paper tackles the problem of training graph foundation models (GFMs) that generalize well on graph-level tasks, and it does so by emphasizing structural information over node features. It reports that GraphProp significantly outperforms competitors in supervised and few-shot learning, particularly on graphs without node attributes.
This work focuses on training graph foundation models (GFMs) that generalize well on graph-level tasks such as graph classification. Effective GFM training requires capturing information that is consistent across domains. We find that graph structures provide more consistent cross-domain information than node features and graph labels. However, traditional GFMs focus primarily on transferring node features from various domains into a unified representation space and often lack structural cross-domain generalization. To address this, we introduce GraphProp, which emphasizes structural generalization. Training GraphProp consists of two phases. In the first phase, we train a structural GFM by predicting graph invariants. Because graph invariants are properties of a graph that depend only on its abstract structure, not on particular labellings or drawings, this structural GFM captures abstract structural information well and provides discriminative graph representations that are comparable across diverse domains. In the second phase, we use the representations produced by the structural GFM as positional encodings to train a comprehensive GFM. This phase uses domain-specific node attributes and graph labels to further improve cross-domain node feature generalization. Our experiments demonstrate that GraphProp significantly outperforms competing methods in supervised learning and few-shot learning, especially on graphs without node attributes.
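To make the notion of a graph invariant concrete, the following minimal sketch computes a few simple invariants of a small undirected graph and checks that they are unchanged when the nodes are relabelled. The particular invariants (edge count, degree sequence, triangle count, number of connected components) and the helper names `graph_invariants` and `relabel` are assumptions chosen for illustration; they are not necessarily the prediction targets or code used by GraphProp.

```python
from itertools import combinations


def graph_invariants(num_nodes, edges):
    """Return a few simple invariants of an undirected simple graph.

    Illustrative helper (hypothetical name): nodes are 0..num_nodes-1 and
    edges are (u, v) pairs without self-loops.
    """
    adj = {v: set() for v in range(num_nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Count triangles: each unordered triple is visited exactly once.
    triangles = sum(
        1
        for u, v, w in combinations(range(num_nodes), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )

    # Count connected components with an iterative depth-first search.
    seen, components = set(), 0
    for start in range(num_nodes):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    return {
        "num_nodes": num_nodes,
        "num_edges": sum(len(nbrs) for nbrs in adj.values()) // 2,
        "degree_sequence": sorted(len(nbrs) for nbrs in adj.values()),
        "num_triangles": triangles,
        "num_components": components,
    }


def relabel(edges, permutation):
    """Rename node i to permutation[i] in every edge (hypothetical helper)."""
    return [(permutation[u], permutation[v]) for u, v in edges]


# A triangle 0-1-2 with a pendant node 3, plus an isolated node 4.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
permutation = [3, 0, 4, 1, 2]

original = graph_invariants(5, edges)
relabelled = graph_invariants(5, relabel(edges, permutation))

# The invariants depend only on structure, so relabelling leaves them unchanged.
print(original == relabelled)
```

Because these quantities are defined for any graph regardless of domain, they can serve as supervision targets even when node attributes are absent, which is the setting the abstract highlights.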