LG AI | Oct 11, 2023

Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT

Dean S. Hazineh, Zechen Zhang, Jeffery Chiu

arXiv:2310.07582v2 | 16.016 citations | h-index: 5 | Has Code

Originality: Synthesis-oriented

AI Analysis

This paper provides incremental insights into how transformers model game states, relevant to researchers studying emergent world models in AI.

The paper investigates Othello-GPT, a simple transformer trained for Othello, and finds that it develops a linear representation of opposing pieces that causally influences its decision-making, with effects varying by layer depth and model complexity.

Foundation models exhibit significant capabilities in decision-making and logical deductions. Nonetheless, a continuing discourse persists regarding their genuine understanding of the world as opposed to mere stochastic mimicry. This paper meticulously examines a simple transformer trained for Othello, extending prior research to enhance comprehension of the emergent world model of Othello-GPT. The investigation reveals that Othello-GPT encapsulates a linear representation of opposing pieces, a factor that causally steers its decision-making process. This paper further elucidates the interplay between the linear world representation and causal decision-making, and their dependence on layer depth and model complexity. We have made the code public.
