LG · AI · Jul 6, 2022

Pure Transformers are Powerful Graph Learners

arXiv:2207.02505v2 · 282 citations · h-index: 27 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses graph representation learning by proposing a simpler, theoretically grounded alternative to complex graph-specific models. Its advance is incremental in that it reuses the existing Transformer architecture rather than introducing a new one.

The paper demonstrates that standard Transformers, without graph-specific modifications, can achieve strong performance on graph learning when nodes and edges are treated as independent tokens with appropriate embeddings. With a suitable choice of token embeddings, the approach is proven to be at least as expressive as an invariant graph network (2-IGN). On a large-scale dataset (PCQM4Mv2), it outperforms message-passing GNN baselines and is competitive with specialized graph Transformer variants.

We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNN). When trained on a large-scale graph dataset (PCQM4Mv2), our method coined Tokenized Graph Transformer (TokenGT) achieves significantly better results compared to GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive bias. Our implementation is available at https://github.com/jw9730/tokengt.
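The tokenization step described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `tokenize_graph` and `self_attention` are hypothetical, the node identifiers are assumed to be orthonormal vectors drawn from a random orthogonal matrix, the type identifiers are fixed one-hot vectors standing in for what would normally be trainable embeddings, and the attention layer uses random, untrained weights purely to show the data flow.

```python
import numpy as np


def tokenize_graph(node_feats, edge_index, edge_feats, rng=None):
    """Turn a graph into a (n + m) x (d + 2n + 2) matrix of tokens.

    node_feats: (n, d) array, edge_index: (m, 2) int array of (u, v) pairs,
    edge_feats: (m, d) array. All names here are illustrative assumptions.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n = node_feats.shape[0]
    m = edge_index.shape[0]

    # Assumed node identifiers: rows of a random orthogonal matrix,
    # so distinct nodes get mutually orthonormal vectors.
    node_ids, _ = np.linalg.qr(rng.standard_normal((n, n)))

    # Assumed type identifiers: one-hot here, trainable in a real model.
    type_node = np.tile(np.array([1.0, 0.0]), (n, 1))
    type_edge = np.tile(np.array([0.0, 1.0]), (m, 1))

    # A node token carries its own identifier twice; an edge token carries
    # the identifiers of its two endpoints.
    node_tokens = np.concatenate(
        [node_feats, node_ids, node_ids, type_node], axis=1)
    u, v = edge_index[:, 0], edge_index[:, 1]
    edge_tokens = np.concatenate(
        [edge_feats, node_ids[u], node_ids[v], type_edge], axis=1)

    # Nodes and edges are then treated as one flat sequence of tokens.
    return np.concatenate([node_tokens, edge_tokens], axis=0)


def self_attention(tokens, d_model=16, rng=None):
    """One untrained single-head self-attention layer over all tokens."""
    rng = np.random.default_rng(1) if rng is None else rng
    d_in = tokens.shape[1]
    w_in = rng.standard_normal((d_in, d_model)) / np.sqrt(d_in)
    w_q, w_k, w_v = (
        rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        for _ in range(3)
    )
    h = tokens @ w_in
    scores = (h @ w_q) @ (h @ w_k).T / np.sqrt(d_model)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return attn @ (h @ w_v)


# Example: a triangle graph with 4-dimensional node and edge features.
x = np.ones((3, 4))
edges = np.array([[0, 1], [1, 2], [2, 0]])
e = np.zeros((3, 4))
tokens = tokenize_graph(x, edges, e)
hidden = self_attention(tokens)
```

In this sketch the graph structure reaches the Transformer only through the identifier columns of each token: an edge token shares identifier vectors with the two node tokens it connects, and no adjacency-based masking or message passing is applied.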

Code Implementations: 2 repos
