LG

May 22, 2025

Graph Data Selection for Domain Adaptation: A Model-Free Approach

Ting-Wei Li, Ruizhong Qiu, Hanghang Tong

arXiv:2505.17293v2

16.97 citations

h-index: 13

Originality: Highly original

AI Analysis

This paper addresses domain adaptation in graph machine learning, particularly under severe distribution shifts and tight computational constraints, and offers an approach that complements model-centric methods.

The paper tackles the problem of graph domain adaptation by proposing GRADATE, a model-free framework that selects optimal training data from the source domain to improve classification on the target domain, demonstrating superior performance and data efficiency compared to existing methods.

Graph domain adaptation (GDA) is a fundamental task in graph machine learning, with techniques such as shift-robust graph neural networks (GNNs) and specialized training procedures developed to tackle the distribution shift problem. Although these model-centric approaches show promising results, they often struggle with severe shifts and constrained computational resources. To address these challenges, we propose a novel model-free framework, GRADATE (GRAph DATa sElector), that selects the best training data from the source domain for the classification task on the target domain. GRADATE picks training samples without relying on any GNN model's predictions or training recipes, leveraging optimal transport theory to capture and adapt to distribution changes. GRADATE is data-efficient and scalable, and it complements existing model-centric GDA approaches. Through comprehensive empirical studies on several real-world graph-level datasets and multiple covariate shift types, we demonstrate that GRADATE outperforms existing selection methods and enhances off-the-shelf GDA methods with far less training data.
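To make the idea of model-free, optimal-transport-based selection concrete, here is a minimal sketch. It is not GRADATE's algorithm, which the abstract does not spell out. It assumes each graph has already been summarized as a fixed-length feature vector, uses entropic optimal transport (Sinkhorn iterations) between uniform source and target distributions, and ranks source samples by how much transport cost they incur. The function names (`sinkhorn_plan`, `select_source`), the scoring rule, the `reg` value, and the toy data are all illustrative assumptions.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropic-regularized OT plan between uniform marginals (Sinkhorn)."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # uniform mass on source samples
    b = [1.0 / m] * m  # uniform mass on target samples
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def select_source(source, target, k, reg=0.1):
    """Return indices of the k source samples that are cheapest to transport.

    Hypothetical scoring rule: a sample's score is the transport cost its
    mass incurs under the OT plan; lower means closer to the target domain.
    No model is trained or queried at any point.
    """
    cost = [[sq_dist(s, t) for t in target] for s in source]
    max_cost = max(max(row) for row in cost) or 1.0
    cost = [[c / max_cost for c in row] for row in cost]  # keep exp() stable
    plan = sinkhorn_plan(cost, reg)
    scores = [
        sum(plan[i][j] * cost[i][j] for j in range(len(target)))
        for i in range(len(source))
    ]
    return sorted(range(len(source)), key=lambda i: scores[i])[:k]


# Toy data: two source samples lie near the target cluster, two lie far away.
source = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [4.8, 5.1]]
target = [[0.2, 0.1], [0.0, 0.3]]
selected = select_source(source, target, k=2)
print(selected)
```

On the toy data, the two source points near the target cluster are selected and the distant pair is dropped. A practical selector for graphs would need a graph-aware cost in place of `sq_dist`; that choice is left open here.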
