LG · AI · Oct 11, 2023

Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT

arXiv:2310.07582v2 · 16 citations · h-index: 5
Originality: Synthesis-oriented
AI Analysis

This paper offers incremental insight into how transformers model game states and is relevant to researchers studying emergent world models in AI.

The paper investigates Othello-GPT, a simple transformer trained for Othello, and finds that it develops a linear representation of opposing pieces that causally influences its decision-making, with effects varying by layer depth and model complexity.

Foundation models exhibit significant capabilities in decision-making and logical deductions. Nonetheless, a continuing discourse persists regarding their genuine understanding of the world as opposed to mere stochastic mimicry. This paper meticulously examines a simple transformer trained for Othello, extending prior research to enhance comprehension of the emergent world model of Othello-GPT. The investigation reveals that Othello-GPT encapsulates a linear representation of opposing pieces, a factor that causally steers its decision-making process. This paper further elucidates the interplay between the linear world representation and causal decision-making, and their dependence on layer depth and model complexity. We have made the code public.
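The two ideas in the abstract, a linear readout of board state and a causal intervention along that readout, can be illustrated with a minimal sketch. It is not the authors' released code. It uses synthetic activations in place of Othello-GPT's real hidden states, assumes a three-way labelling of a single square (empty / mine / theirs), and fits the probe by ordinary least squares. All function names and parameters below are hypothetical.

```python
import numpy as np

# Assumed label convention for one board square (not taken from the paper):
EMPTY, MINE, THEIRS = 0, 1, 2


def make_synthetic_activations(n=2000, d=64, seed=0):
    """Stand-in for hidden states: one random direction per class plus noise."""
    rng = np.random.default_rng(seed)
    class_dirs = rng.normal(size=(3, d))
    labels = rng.integers(0, 3, size=n)
    acts = class_dirs[labels] + 0.5 * rng.normal(size=(n, d))
    return acts, labels


def fit_linear_probe(acts, labels, n_classes=3):
    """Least-squares linear probe; returns weights of shape (d + 1, n_classes).

    The last row is the bias term.
    """
    X = np.hstack([acts, np.ones((len(acts), 1))])
    Y = np.eye(n_classes)[labels]  # one-hot targets
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def probe_scores(W, acts):
    """Per-class probe scores for a batch of activations."""
    return acts @ W[:-1] + W[-1]


def probe_predict(W, acts):
    return probe_scores(W, acts).argmax(axis=1)


def intervene(W, act, source, target, margin=1.0):
    """Shift one activation along (w_target - w_source).

    The step size is chosen so the target score exceeds the source score
    by exactly `margin` under the probe.
    """
    direction = W[:-1, target] - W[:-1, source]
    s = probe_scores(W, act[None, :])[0]
    alpha = (s[source] - s[target] + margin) / (direction @ direction)
    return act + alpha * direction


if __name__ == "__main__":
    acts, labels = make_synthetic_activations()
    W = fit_linear_probe(acts[:1500], labels[:1500])
    acc = (probe_predict(W, acts[1500:]) == labels[1500:]).mean()
    print(f"held-out probe accuracy: {acc:.3f}")

    # Take one activation labelled MINE and push it toward THEIRS.
    idx = np.flatnonzero(labels == MINE)[0]
    edited = intervene(W, acts[idx], source=MINE, target=THEIRS)
    print("scores before:", probe_scores(W, acts[idx][None, :])[0])
    print("scores after: ", probe_scores(W, edited[None, :])[0])
```

In a real experiment the edited activation would be written back into the model's forward pass and the change in predicted legal moves measured. That step depends on the model code and is left out here.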
