LG, AI | Dec 10, 2024

GPT-2 Through the Lens of Vector Symbolic Architectures

Amazon
arXiv:2412.07947v1 | 2 citations | h-index: 8
Originality: Incremental advance
AI Analysis

This work helps AI interpretability researchers understand the principles by which transformer models operate. It is an incremental advance, building on existing probing and VSA concepts.

The paper investigates whether GPT-2's transformer architecture operates similarly to vector symbolic architectures (VSA). It finds that GPT-2 uses VSA-like mechanisms, namely bundling and binding of nearly orthogonal vectors, and that these mechanisms explain a significant portion of its neural weights.

Understanding the general principles behind transformer models remains a complex endeavor. Experiments with probing and with disentangling features using sparse autoencoders (SAE) suggest that these models might manage linear features embedded as directions in the residual stream. This paper explores the resemblance between the decoder-only transformer architecture and vector symbolic architectures (VSA). It presents experiments indicating that GPT-2 uses mechanisms involving bundling and binding operations on nearly orthogonal vectors, similar to VSA, for computation and communication between layers. It further shows that these principles help explain a significant portion of the actual neural weights.
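
For readers unfamiliar with VSA, the two operations named in the abstract can be illustrated with a minimal sketch. It assumes a generic bipolar VSA in which bundling is elementwise addition and binding is elementwise multiplication. This is a textbook-style illustration only, not the paper's experimental code and not a claim about how GPT-2 implements these operations. The names role_subject, role_object, cat, dog, and bird are hypothetical.

```python
import math
import random


def random_bipolar(dim, rng):
    """Draw a random vector with entries in {-1, +1}.

    In high dimensions, two such vectors are nearly orthogonal.
    """
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bundle(*vectors):
    """Bundling (assumed here to be elementwise sum): superpose several vectors."""
    return [sum(components) for components in zip(*vectors)]


def bind(a, b):
    """Binding (assumed here to be elementwise product).

    For bipolar vectors this operation is its own inverse: bind(a, bind(a, b)) == b.
    """
    return [x * y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
DIM = 4096  # illustrative dimension, not taken from the paper

# Hypothetical role and filler vectors.
role_subject = random_bipolar(DIM, rng)
role_object = random_bipolar(DIM, rng)
cat, dog, bird = (random_bipolar(DIM, rng) for _ in range(3))

# Store two role-filler pairs in a single vector.
record = bundle(bind(role_subject, cat), bind(role_object, dog))

# Unbind with the subject role: the result is cat plus noise
# that is nearly orthogonal to every stored filler.
recovered = bind(role_subject, record)

sim_roles = cosine(role_subject, role_object)  # near 0: nearly orthogonal
sim_cat = cosine(recovered, cat)               # clearly above 0
sim_dog = cosine(recovered, dog)               # near 0
sim_bird = cosine(recovered, bird)             # near 0
```

Because the random vectors are nearly orthogonal, the bundled record can hold several bound pairs at once, and unbinding with a role recovers a vector that is much more similar to the matching filler than to any other.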

