RO, AI, LG | Dec 1, 2025

Real-World Robot Control by Deep Active Inference With a Temporally Hierarchical World Model

arXiv:2512.01924v1 | 2 citations | h-index: 2 | IEEE Robot Autom Lett
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enabling robots to perform both goal-directed and exploratory actions under uncertainty. It is an incremental improvement over existing deep active inference methods.

The paper tackles the problem of robot control in uncertain real-world environments by proposing a deep active inference framework with a temporally hierarchical world model, achieving high success rates in diverse manipulation tasks and enabling computationally tractable action selection.

Robots in uncertain real-world environments must perform both goal-directed and exploratory actions. However, most deep learning-based control methods neglect exploration and struggle under uncertainty. To address this, we adopt deep active inference, a framework that accounts for human goal-directed and exploratory actions. Yet, conventional deep active inference approaches face challenges due to limited environmental representation capacity and high computational cost in action selection. We propose a novel deep active inference framework that consists of a world model, an action model, and an abstract world model. The world model encodes environmental dynamics into hidden state representations at slow and fast timescales. The action model compresses action sequences into abstract actions using vector quantization, and the abstract world model predicts future slow states conditioned on the abstract action, enabling low-cost action selection. We evaluate the framework on object-manipulation tasks with a real-world robot. Results show that it achieves high success rates across diverse manipulation tasks and switches between goal-directed and exploratory actions in uncertain settings, while making action selection computationally tractable. These findings highlight the importance of modeling multiple timescale dynamics and abstracting actions and state transitions.
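The abstract says that vector quantization compresses action sequences into abstract actions, and that the abstract world model predicts future slow states for each one. The minimal sketch below illustrates why that makes action selection cheap: a finite codebook can be scored exhaustively, one prediction per code. Everything in it is a hypothetical stand-in and is not taken from the paper: the toy 2-D `CODEBOOK`, the additive `predict_slow_state`, and the `goal_distance` score, which replaces whatever objective the authors actually minimize.

```python
# Toy illustration only: all names and values are hypothetical, not the paper's.

# Hypothetical codebook of abstract actions (one 2-D code per entry).
CODEBOOK = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latent, codebook):
    """Vector quantization: return (index, code) of the nearest codebook entry."""
    index = min(range(len(codebook)),
                key=lambda k: squared_distance(latent, codebook[k]))
    return index, codebook[index]


def predict_slow_state(slow_state, code):
    """Stand-in for an abstract world model: shifts the slow state by the code."""
    return tuple(s + c for s, c in zip(slow_state, code))


def select_abstract_action(slow_state, codebook, predict, score):
    """Score every abstract action once and return (best index, all scores)."""
    scores = [score(predict(slow_state, code)) for code in codebook]
    best = min(range(len(codebook)), key=scores.__getitem__)
    return best, scores


GOAL = (0.0, 2.0)  # hypothetical preferred slow state


def goal_distance(predicted_state):
    """Stand-in score: squared distance from the predicted state to GOAL."""
    return squared_distance(predicted_state, GOAL)


# An encoded action sequence snaps to its nearest abstract action.
code_index, code = quantize((0.9, 0.2), CODEBOOK)

# Action selection costs one prediction per codebook entry.
best_index, scores = select_abstract_action(
    (0.0, 0.0), CODEBOOK, predict_slow_state, goal_distance
)
```

In this toy setup the search cost grows linearly with the codebook size, rather than with the number of possible low-level action sequences. The framework described above would use learned models in place of the additive predictor and the distance score.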

