LG AI Feb 23

VecFormer: Towards Efficient and Generalizable Graph Transformer with Graph Token Attention

arXiv:2602.19622v1
h-index: 10
Originality: Incremental advance
AI Analysis

This paper addresses scalability and generalization issues in graph representation learning, which matter to both researchers and practitioners. The advance is incremental, since it builds on existing Graph Transformer methods.

The paper tackles the computational inefficiency and poor generalization of Graph Transformers by proposing VecFormer, a model trained in two stages that applies attention at the graph-token level. In experiments across datasets of various sizes, it reports faster and more accurate node classification than existing Graph Transformers, particularly in out-of-distribution scenarios.

Graph Transformer has demonstrated impressive capabilities in the field of graph representation learning. However, existing approaches face two critical challenges: (1) most models suffer from exponentially increasing computational complexity, making it difficult to scale to large graphs; (2) attention mechanisms based on node-level operations limit the flexibility of the model and result in poor generalization performance in out-of-distribution (OOD) scenarios. To address these issues, we propose VecFormer (the Vector Quantized Graph Transformer), an efficient and highly generalizable model for node classification, particularly under OOD settings. VecFormer adopts a two-stage training paradigm. In the first stage, two codebooks are used to reconstruct the node features and the graph structure, aiming to learn the rich semantic Graph Codes. In the second stage, attention mechanisms are performed at the Graph Token level based on the transformed cross codebook, reducing computational complexity while enhancing the model's generalization capability. Extensive experiments on datasets of various sizes demonstrate that VecFormer outperforms the existing Graph Transformer in both performance and speed.
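To make the idea of token-level attention concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the single random codebook, the sizes N, K and D, and the function names quantize and token_attention are all assumptions made for illustration, and the paper's two codebooks, learned projections and training stages are omitted. It shows only nearest-entry vector quantization, and attention in which each node scores K codebook tokens instead of all N nodes.

```python
import math
import random


def quantize(features, codebook):
    """Assign each node to its nearest codebook entry (squared Euclidean distance)."""
    codes = []
    for x in features:
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
        codes.append(dists.index(min(dists)))
    return codes


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_attention(features, codebook):
    """Each node attends over the K codebook tokens rather than over all N nodes."""
    dim = len(codebook[0])
    scale = math.sqrt(dim)
    out = []
    for q in features:
        # One score per codebook token: K scores per node, not N.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in codebook]
        weights = softmax(scores)
        out.append([sum(w * k[d] for w, k in zip(weights, codebook)) for d in range(dim)])
    return out


# Hypothetical sizes: N nodes, K codebook tokens, D feature dimensions.
random.seed(0)
N, K, D = 200, 8, 4
features = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
codebook = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]

codes = quantize(features, codebook)
attended = token_attention(features, codebook)

print("attention score entries: node-level", N * N, "vs token-level", N * K)
```

With these assumed sizes, node-level attention would compute 40,000 scores while the token-level variant computes 1,600. The saving grows with N as long as the number of tokens K stays small and fixed.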

