AI · Sep 16, 2025

From Next Token Prediction to (STRIPS) World Models -- Preliminary Results

arXiv:2509.13389v3 · 2 citations · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of automated world model learning for AI planning, but it appears incremental: it applies existing deep learning methods to a specific domain.

The paper tackles the problem of learning propositional STRIPS world models from action traces using transformers and gradient descent, showing that a suitable transformer architecture can faithfully represent these models and learn them from random valid and invalid action sequences.

We consider the problem of learning propositional STRIPS world models from action traces alone, using a deep learning architecture (transformers) and gradient descent. The task is cast as a supervised next token prediction problem where the tokens are the actions, and an action $a$ may follow an action sequence if the hidden effects of the previous actions do not make an action precondition of $a$ false. We show that a suitable transformer architecture can faithfully represent propositional STRIPS world models, and that the models can be learned from sets of random valid (positive) and invalid (negative) action sequences alone. A number of experiments are reported.
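The validity condition described in the abstract can be illustrated with a minimal sketch. It is not the paper's transformer architecture or training setup. It only shows how a propositional STRIPS model decides which actions may follow a trace, which is the signal a next token predictor would have to recover. The toy door domain, the `Action` class, and the helper names are all hypothetical, and the sketch assumes positive preconditions and a known initial state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A propositional STRIPS action (hypothetical toy representation)."""
    name: str
    pre: frozenset      # atoms that must be true before the action
    add: frozenset      # atoms made true by the action
    delete: frozenset   # atoms made false by the action


# Hypothetical toy domain: a single door that can be opened, closed, or walked through.
ACTIONS = {
    a.name: a
    for a in [
        Action("open_door", frozenset({"closed"}), frozenset({"open"}), frozenset({"closed"})),
        Action("close_door", frozenset({"open"}), frozenset({"closed"}), frozenset({"open"})),
        Action("walk_through", frozenset({"open"}), frozenset(), frozenset()),
    ]
}
INIT = frozenset({"closed"})  # assumed initial state


def progress(trace, actions=ACTIONS, init=INIT):
    """Return the state reached by a trace, or None if some action is inapplicable."""
    state = set(init)
    for name in trace:
        action = actions[name]
        if not action.pre <= state:   # a precondition was made false (or never true)
            return None
        state = (state - action.delete) | action.add
    return state


def is_valid(trace, actions=ACTIONS, init=INIT):
    """A trace is positive (valid) if every action is applicable when it occurs."""
    return progress(trace, actions, init) is not None


def next_action_labels(trace, actions=ACTIONS, init=INIT):
    """Names of actions that may follow a valid trace: the next-token targets."""
    state = progress(trace, actions, init)
    if state is None:
        return []
    return sorted(name for name, a in actions.items() if a.pre <= state)


if __name__ == "__main__":
    print(is_valid(["open_door", "walk_through", "close_door"]))
    print(is_valid(["open_door", "close_door", "walk_through"]))
    print(next_action_labels(["open_door"]))
```

In this sketch the states are computed explicitly. In the setting the abstract describes, the learner sees only the action tokens and the positive or negative label of each sequence, so the effects and preconditions stay hidden and must be inferred.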

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
