LG SIJul 27, 2024

Can Modifying Data Address Graph Domain Adaptation?

Renhong Huang, Jiarong Xu, Xin Jiang, Ruichuan An, Yang Yang

arXiv:2407.19311v111.514 citationsh-index: 15Has Code

Originality Incremental advance

AI Analysis

This addresses the challenge of knowledge transfer in graph neural networks for real-world applications with domain shifts, offering a novel data-centric approach that is incremental but shows strong gains.

The paper tackles the problem of graph neural networks struggling with distribution shifts across domains by proposing a data-centric method for unsupervised graph domain adaptation, which generates a small transferable graph and achieves an average performance improvement of 2.16% over baselines while using only 0.25-1% of the original training data.

Graph neural networks (GNNs) have demonstrated remarkable success in numerous graph analytical tasks. Yet, their effectiveness is often compromised in real-world scenarios due to distribution shifts, limiting their capacity for knowledge transfer across changing environments or domains. Recently, Unsupervised Graph Domain Adaptation (UGDA) has been introduced to resolve this issue. UGDA aims to facilitate knowledge transfer from a labeled source graph to an unlabeled target graph. Current UGDA efforts primarily focus on model-centric methods, such as employing domain invariant learning strategies and designing model architectures. However, our critical examination reveals the limitations inherent to these model-centric methods, while a data-centric method allowed to modify the source graph provably demonstrates considerable potential. This insight motivates us to explore UGDA from a data-centric perspective. By revisiting the theoretical generalization bound for UGDA, we identify two data-centric principles for UGDA: alignment principle and rescaling principle. Guided by these principles, we propose GraphAlign, a novel UGDA method that generates a small yet transferable graph. By exclusively training a GNN on this new graph with classic Empirical Risk Minimization (ERM), GraphAlign attains exceptional performance on the target graph. Extensive experiments under various transfer scenarios demonstrate the GraphAlign outperforms the best baselines by an average of 2.16%, training on the generated graph as small as 0.25~1% of the original training graph.

View on arXiv PDF Code

Similar