Self-Attention in Colors: Another Take on Encoding Graph Structure in Transformers
The paper addresses the challenge of incorporating graph topology into Transformers for domains such as molecular graphs. The contribution appears incremental, since it builds on existing graph Transformer methods.
The paper tackles the problem of encoding graph structure in Transformers by introducing Chromatic Self-Attention (CSA), a self-attention mechanism used in a graph Transformer that integrates graph structural information and edge features without local message-passing. It reports state-of-the-art results on the ZINC benchmark dataset.
We introduce a novel self-attention mechanism, Chromatic Self-Attention (CSA), which extends the notion of attention scores to attention _filters_ that independently modulate the feature channels. We showcase CSA in a fully-attentional graph Transformer, the Chromatic Graph Transformer (CGT), which integrates both graph structural information and edge features, completely bypassing the need for local message-passing components. Our method flexibly encodes graph structure through node-node interactions by enriching the original edge features with a relative positional encoding scheme. We propose a new scheme based on random walks that encodes both structural and positional information, and show how to incorporate higher-order topological information, such as rings in molecular graphs. Our approach achieves state-of-the-art results on the ZINC benchmark dataset, while providing a flexible framework for encoding graph structure and incorporating higher-order topology.
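The two ideas in the abstract, attention filters and a random-walk relative encoding, can be illustrated with a small sketch. This is not the paper's implementation. It assumes one plausible reading in which each feature channel gets its own softmax over neighbours, and in which the pairwise encoding for nodes (i, j) is the list of k-step random-walk landing probabilities. The function names `random_walk_features` and `chromatic_attention`, the additive use of the encoding as a per-channel bias, and the toy path graph are all hypothetical choices made for illustration.

```python
import math

def random_walk_features(adj, num_steps):
    """Return feats[i][j] = [P^1[i][j], ..., P^K[i][j]] with P = D^-1 A.

    Assumption: the relative encoding is the k-step landing probability
    of a simple random walk; the paper's exact scheme may differ.
    """
    n = len(adj)
    # Row-normalise the adjacency matrix; isolated nodes keep a zero row.
    P = []
    for row in adj:
        deg = sum(row)
        P.append([a / deg if deg else 0.0 for a in row])
    feats = [[[] for _ in range(n)] for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(num_steps):
        # power <- power @ P, so after step k it holds P^k.
        power = [[sum(power[i][m] * P[m][j] for m in range(n))
                  for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                feats[i][j].append(power[i][j])
    return feats

def chromatic_attention(q, k, v, bias):
    """Channel-wise attention: one softmax over j for every (i, channel).

    Standard attention yields a scalar weight per pair (i, j); here the
    weight is a vector with one entry per channel (a "filter").
    Assumption: score[i][j][c] = q[i][c] * k[j][c] + bias[i][j][c].
    """
    n, d = len(q), len(q[0])
    out = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for c in range(d):
            scores = [q[i][c] * k[j][c] + bias[i][j][c] for j in range(n)]
            top = max(scores)  # subtract the max for numerical stability
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            out[i][c] = sum(w / total * v[j][c] for j, w in enumerate(weights))
    return out

# Toy example: a 3-node path graph with 2 feature channels.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
bias = random_walk_features(adj, num_steps=2)  # 2 walk lengths -> 2 channels
x = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]       # hypothetical node features
print(chromatic_attention(x, x, x, bias))
```

In this sketch every node attends to every other node, so graph structure enters only through the pairwise bias, which mirrors the abstract's point about avoiding local message-passing. A real model would add learned projections for the queries, keys, values, and edge encodings.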