Supercharging Graph Transformers with Advective Diffusion
The work addresses a key challenge in graph learning for domains such as biology and social networks and offers a new approach to improving generalization, although it appears incremental in that it builds on existing graph Transformer and diffusion methods.
The paper tackles the problem of how machine learning models generalize under topological shifts in graph data. It proposes the Advective Diffusion Transformer (AdvDIFFormer), which has a provable capability for controlling generalization error and shows empirical superiority in predictive tasks on information networks, molecular screening, and protein interactions.
The capability of generalization is a cornerstone of the success of modern learning systems. For non-Euclidean data such as graphs, which inherently involve topological structures, one important aspect neglected by prior studies is how machine learning models generalize under topological shifts. This paper proposes the Advective Diffusion Transformer (AdvDIFFormer), a physics-inspired graph Transformer model designed to address this challenge. The model is derived from advective diffusion equations, which describe a class of continuous message passing processes over both observed and latent topological structures. We show that AdvDIFFormer has a provable capability for controlling generalization error under topological shifts, which, in contrast, cannot be guaranteed by graph diffusion models, i.e., the generalized formulation of common graph neural networks in continuous space. Empirically, the model demonstrates superiority in various predictive tasks across information networks, molecular screening, and protein interactions.
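To make the idea of continuous message passing over observed and latent structures more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the dynamics take the form dZ/dt = (C + beta * V - I) Z, where C is a latent all-pair coupling (here a row-wise softmax over feature similarities), V is the symmetrically normalized observed adjacency matrix, and beta weights the observed term. The function names, the choice of similarity, and the explicit Euler integrator are all illustrative assumptions.

```python
import numpy as np


def softmax_rows(scores):
    """Row-wise softmax; every row of the result sums to 1."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)


def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of the observed graph."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5  # isolated nodes keep a zero row
    return inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def advective_diffusion_step(z, adj, beta=1.0, dt=0.1):
    """One explicit Euler step of dZ/dt = (C + beta * V - I) Z (assumed form).

    z    : (n_nodes, n_features) node states
    adj  : (n_nodes, n_nodes) observed adjacency matrix
    beta : weight of the observed-structure term (hypothetical default)
    dt   : step size (hypothetical default)
    """
    c = softmax_rows(z @ z.T)        # latent coupling between all node pairs
    v = normalized_adjacency(adj)    # coupling along observed edges
    dz = c @ z + beta * (v @ z) - z
    return z + dt * dz


# Usage: a 3-node path graph with random 2-dimensional node states.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
z0 = np.random.default_rng(0).normal(size=(3, 2))
z1 = advective_diffusion_step(z0, adj, beta=0.5, dt=0.1)
```

In this sketch the latent term depends only on node states and the observed term only on the given edges, so changing `adj` at test time affects the update only through the `beta`-weighted term. Setting `beta=0` gives a purely attention-driven update with no dependence on the observed graph.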