LG · AI · Oct 11, 2022

Relational Attention: Generalizing Transformers for Graph-Structured Tasks

Microsoft
arXiv:2210.05062v3 · 54 citations · h-index: 11
Originality: Highly original
AI Analysis

The paper addresses a key problem for AI researchers and practitioners who work with graph data: standard transformers have no native way to represent relations between entities. Its approach is a substantial advance in graph reasoning rather than an incremental refinement.

The paper tackles the limitation of transformers in handling graph-structured data by generalizing transformer attention to consider and update edge vectors in each layer. The resulting relational transformer dramatically outperforms state-of-the-art graph neural networks on a range of graph-structured tasks, including the CLRS Algorithmic Reasoning Benchmark.

Transformers flexibly operate over sets of real-valued vectors representing task-specific entities and their attributes, where each vector might encode one word-piece token and its position in a sequence, or some piece of information that carries no position at all. But as set processors, transformers are at a disadvantage in reasoning over more general graph-structured data where nodes represent entities and edges represent relations between entities. To address this shortcoming, we generalize transformer attention to consider and update edge vectors in each transformer layer. We evaluate this relational transformer on a diverse array of graph-structured tasks, including the large and challenging CLRS Algorithmic Reasoning Benchmark. There, it dramatically outperforms state-of-the-art graph neural networks expressly designed to reason over graph-structured data. Our analysis demonstrates that these gains are attributable to relational attention's inherent ability to leverage the greater expressivity of graphs over sets.
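To make the idea of attention that "considers and updates edge vectors" concrete, here is a minimal single-head sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not spell out the formulas, so every detail here is an assumption made for this example: the names (`relational_attention_layer`, `init_params`, `Wq_n`, `Wq_e`, `We`), the choice to add a projected edge vector to each query, key, and value, and the tanh edge update. Residual connections, layer normalization, and multiple heads are omitted.

```python
import math
import random


def matvec(x, W):
    """Row vector x (length a) times matrix W (a x b) -> list of length b."""
    return [sum(x[k] * W[k][j] for k in range(len(x))) for j in range(len(W[0]))]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(d_node, d_edge, seed=0):
    """Hypothetical parameter set: node and edge projections for q, k, v."""
    rng = random.Random(seed)
    p = {}
    for name in ("q", "k", "v"):
        p[f"W{name}_n"] = rand_matrix(d_node, d_node, rng)
        p[f"W{name}_e"] = rand_matrix(d_edge, d_node, rng)
    # Edge update reads [e_ij, e_ji, n_i, n_j] and writes a new edge vector.
    p["We"] = rand_matrix(2 * d_edge + 2 * d_node, d_edge, rng)
    return p


def relational_attention_layer(nodes, edges, p):
    """One simplified layer; returns (new_nodes, new_edges).

    nodes: list of N node vectors; edges: N x N grid of edge vectors,
    where edges[i][j] describes the relation from node i to node j.
    """
    n = len(nodes)
    d = len(p["Wq_n"][0])
    new_nodes = []
    for i in range(n):
        scores, values = [], []
        for j in range(n):
            e = edges[i][j]
            # Assumption: the edge vector conditions the query, key, and value.
            q = add(matvec(nodes[i], p["Wq_n"]), matvec(e, p["Wq_e"]))
            k = add(matvec(nodes[j], p["Wk_n"]), matvec(e, p["Wk_e"]))
            v = add(matvec(nodes[j], p["Wv_n"]), matvec(e, p["Wv_e"]))
            scores.append(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))
            values.append(v)
        w = softmax(scores)
        new_nodes.append(
            [sum(w[j] * values[j][c] for j in range(n)) for c in range(d)]
        )
    # Edge update (assumed form): each edge sees itself, its reverse edge,
    # and the freshly updated endpoint nodes.
    new_edges = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            feats = edges[i][j] + edges[j][i] + new_nodes[i] + new_nodes[j]
            new_edges[i][j] = [math.tanh(x) for x in matvec(feats, p["We"])]
    return new_nodes, new_edges
```

Setting every edge vector to zero reduces the node update to ordinary set-style self-attention, which is one way to see the sense in which attention over edges generalizes the standard transformer. The per-pair projections cost O(N²) work per layer, as dense attention already does.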

Code Implementations: 1 repo
