Object-Centric World Models for Causality-Aware Reinforcement Learning
This work targets sample efficiency and final performance in reinforcement learning for complex, object-rich environments. Its contribution is incremental: it integrates object-centric and causality-aware methods in a new way.
The paper tackles the challenge of building world models for high-dimensional, non-stationary environments with multiple objects. It proposes STICA, a framework that uses object-centric Transformers and causality-aware policy and value networks, and reports that it outperforms state-of-the-art agents in sample efficiency and final performance on object-rich benchmarks.
World models have been developed to support sample-efficient deep reinforcement learning agents. However, it remains challenging for world models to accurately replicate environments that are high-dimensional, non-stationary, and composed of multiple objects with rich interactions, since most world models learn holistic representations of all environmental components. By contrast, humans perceive the environment by decomposing it into discrete objects, facilitating efficient decision-making. Motivated by this insight, we propose \emph{Slot Transformer Imagination with CAusality-aware reinforcement learning} (STICA), a unified framework in which object-centric Transformers serve as both the world model and the causality-aware policy and value networks. STICA represents each observation as a set of object-centric tokens, together with tokens for the agent action and the resulting reward, enabling the world model to predict token-level dynamics and interactions. The policy and value networks then estimate token-level cause--effect relations and use them in the attention layers, yielding causality-guided decision-making. Experiments on object-rich benchmarks demonstrate that STICA consistently outperforms state-of-the-art agents in both sample efficiency and final performance.
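To make the token-level idea concrete, the following is a minimal illustrative sketch, not the STICA implementation. It assumes that the estimated cause--effect relations are available as a hard Boolean mask and that they enter the attention layer by restricting which tokens each token may attend to; the abstract does not specify either detail. The names `build_token_sequence`, `causal_masked_attention`, and the example mask are hypothetical, and queries, keys, and values are taken to be the raw tokens without learned projections.

```python
import math


def build_token_sequence(slots, action_token, reward_token):
    """Stack the object slots with one action token and one reward token."""
    return [list(s) for s in slots] + [list(action_token), list(reward_token)]


def causal_masked_attention(tokens, causal_mask):
    """Single-head self-attention restricted to assumed cause -> effect pairs.

    causal_mask[i][j] is True when token j is treated as a cause of token i.
    Every token may also attend to itself, so no row is left empty.
    No learned projections are used (an assumption made for brevity).
    """
    n, d = len(tokens), len(tokens[0])
    outputs, all_weights = [], []
    for i in range(n):
        # Indices of the tokens that token i is allowed to attend to.
        allowed = [j for j in range(n) if causal_mask[i][j] or j == i]
        # Scaled dot-product scores over the allowed tokens only.
        scores = [
            sum(a * b for a, b in zip(tokens[i], tokens[j])) / math.sqrt(d)
            for j in allowed
        ]
        # Numerically stable softmax over the allowed tokens.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [0.0] * n
        for j, e in zip(allowed, exps):
            row[j] = e / total
        all_weights.append(row)
        # Attention output: weighted sum of the token vectors.
        outputs.append(
            [sum(row[j] * tokens[j][c] for j in range(n)) for c in range(d)]
        )
    return outputs, all_weights


# Three object slots plus an action token and a reward token (toy values).
slots = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
tokens = build_token_sequence(slots, action_token=[1.0, 1.0], reward_token=[0.0, 0.0])

# Hypothetical cause -> effect mask; rows are effects, columns are causes.
causal_mask = [
    [False, False, False, True, False],   # slot 0 is affected by the action
    [True, False, False, False, False],   # slot 1 is affected by slot 0
    [False, False, False, False, False],  # slot 2 has no external cause
    [False, False, False, False, False],  # action token has no cause
    [True, True, False, True, False],     # reward depends on slots 0, 1 and the action
]

outputs, weights = causal_masked_attention(tokens, causal_mask)
```

In this toy setup, a token with no estimated causes (such as slot 2) attends only to itself and therefore passes through unchanged, while the reward token aggregates information only from the tokens marked as its causes. A soft, learned weighting would be an equally plausible reading of the abstract.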