LG · May 13

Learning POMDP World Models from Observations with Language-Model Priors

arXiv:2605.13740 · 20.6 · Has Code
Predicted impact: top 27% in LG · last 90 days · Originality: Incremental advance
AI Analysis

For AI agents learning world models under partial observability, this work shows that language-model priors can reduce costly environment interaction, offering a practical tool for sample-efficient learning.

Pinductor uses LLM priors to learn POMDP world models from a few observation-action trajectories, matching the performance of LLM-based methods that have privileged access to the hidden state and surpassing tabular POMDP baselines in sample efficiency.

Whether navigating a building, operating a robot, or playing a game, an agent that acts effectively in an environment must first learn an internal model of how that environment works. Partially-observable Markov decision processes (POMDPs) provide a flexible modeling class for such internal world models, but learning them from observation-action trajectories alone is challenging and typically requires extensive environment interaction. We ask whether language-model priors can reduce costly interaction by leveraging prior knowledge, and introduce Pinductor (POMDP-inductor): an LLM proposes candidate POMDP models from a few observation-action trajectories and iteratively refines them to optimize a belief-based likelihood score. Despite using strictly less information, Pinductor matches the performance and sample efficiency of LLM-based POMDP learning methods that assume privileged access to the hidden state, while significantly surpassing the sample efficiency of tabular POMDP baselines. Further results show that performance scales with LLM capability and degrades gracefully as semantic information about the environment is withheld. Together, these results position language-model priors as a practical tool for sample-efficient world-model learning under partial observability, and a step toward generalist agents in real-world environments. Code is available at https://github.com/atomresearch/pinductor.
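
To make the idea of a belief-based likelihood score concrete, here is a minimal sketch that assumes a small tabular POMDP and uses the standard forward (Bayes filter) recursion. It is not taken from the Pinductor code, and the paper's actual score and model representation may differ. The names `belief_log_likelihood`, `best_candidate`, `sticky`, and `uniform` are hypothetical and chosen only for illustration. The score needs only actions and observations, never the hidden state.

```python
import math


def belief_log_likelihood(b0, T, O, actions, observations):
    """Log-likelihood of observations given actions under a tabular POMDP.

    Assumed (hypothetical) layout:
      b0[s]        initial belief over hidden states
      T[a][s][s2]  P(s2 | s, a)
      O[a][s2][o]  P(o | s2, a)
    """
    belief = list(b0)
    n = len(belief)
    total = 0.0
    for a, o in zip(actions, observations):
        # Predict: push the current belief through the transition model.
        predicted = [
            sum(belief[s] * T[a][s][s2] for s in range(n)) for s2 in range(n)
        ]
        # Correct: weight each state by the probability of the seen observation.
        unnorm = [predicted[s2] * O[a][s2][o] for s2 in range(n)]
        evidence = sum(unnorm)  # P(o_t | earlier observations and actions)
        if evidence == 0.0:
            return float("-inf")  # the model rules this trajectory out
        total += math.log(evidence)
        belief = [u / evidence for u in unnorm]
    return total


def best_candidate(candidates, trajectories):
    """Pick the candidate model with the highest total score over trajectories."""
    def score(model):
        return sum(
            belief_log_likelihood(model["b0"], model["T"], model["O"], acts, obs)
            for acts, obs in trajectories
        )
    return max(candidates, key=score)


# Two toy candidates with two hidden states, one action, two observations.
sticky = {
    "b0": [0.5, 0.5],
    "T": [[[0.9, 0.1], [0.1, 0.9]]],  # states tend to persist
    "O": [[[0.9, 0.1], [0.1, 0.9]]],  # observations mostly reveal the state
}
uniform = {
    "b0": [0.5, 0.5],
    "T": [[[0.5, 0.5], [0.5, 0.5]]],
    "O": [[[0.5, 0.5], [0.5, 0.5]]],
}
# Each trajectory is (actions, observations); no hidden states are recorded.
trajectories = [([0, 0, 0, 0], [0, 0, 0, 0]), ([0, 0, 0], [1, 1, 1])]
chosen = best_candidate([uniform, sticky], trajectories)
```

In this toy setting the observations come in long runs, so a model with persistent states explains them better than a uniform one. In a refinement loop of the kind the abstract describes, a score like this would presumably be fed back to the LLM to guide the next round of proposals, but that loop is not sketched here.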
