Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers
This work addresses the challenge of relational abstraction and generalization in AI, particularly for tasks that require explicit reasoning, though as an extension of existing Transformer architectures it appears incremental.
The authors tackle explicit relational reasoning in Transformers by proposing the Abstractor, a module built on relational cross-attention. They report dramatic improvements in sample efficiency over standard Transformers on purely relational sequence-to-sequence tasks, and consistent gains in performance and sample efficiency on mathematical problem-solving tasks.
An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is motivated by an architectural inductive bias for relational learning that disentangles relational information from object-level features. This enables explicit relational reasoning, supporting abstraction and generalization from limited data. The Abstractor is first evaluated on simple discriminative relational tasks and compared to existing relational architectures. Next, the Abstractor is evaluated on purely relational sequence-to-sequence tasks, where dramatic improvements are seen in sample efficiency compared to standard Transformers. Finally, Abstractors are evaluated on a collection of tasks based on mathematical problem solving, where consistent improvements in performance and sample efficiency are observed.
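To make the idea of separating relational information from object-level features concrete, here is a minimal sketch in plain Python. The abstract does not give the formula for relational cross-attention, so the sketch rests on an assumption: attention weights are computed from the input objects (queries and keys), while the values are learned, input-independent symbol vectors. The function name `relational_cross_attention`, the helper functions, the dimensions, and the random weights are all hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def relational_cross_attention(objects, symbols, w_q, w_k):
    """Assumed form: softmax(Q K^T / sqrt(d)) applied to symbols.

    Queries and keys come from the input objects, so the attention
    matrix encodes pairwise relations between objects. The values are
    the symbols, which do not depend on the input.
    """
    q = matmul(objects, w_q)
    k = matmul(objects, w_k)
    scale = math.sqrt(len(q[0]))
    scores = [[v / scale for v in row] for row in matmul(q, transpose(k))]
    relations = softmax_rows(scores)
    return matmul(relations, symbols), relations


rng = random.Random(0)
n, d_in, d_proj, d_sym = 4, 3, 5, 2          # illustrative sizes only
objects = random_matrix(n, d_in, rng)        # stand-in for input object features
symbols = random_matrix(n, d_sym, rng)       # stand-in for learned symbols
w_q = random_matrix(d_in, d_proj, rng)       # stand-in for learned projections
w_k = random_matrix(d_in, d_proj, rng)

abstract_states, relations = relational_cross_attention(objects, symbols, w_q, w_k)
```

Under this assumption, each output row is a weighted average of the symbols, so the input objects influence the output only through the relation matrix and never through their own feature values. Standard cross-attention would instead mix projections of the objects themselves into the output.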