LG | AI | RO | Mar 27, 2023

Model-Based Reinforcement Learning with Isolated Imaginations

arXiv:2303.14889v2 | 6 citations | h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses a practical problem for autonomous driving and similar domains by improving model-based reinforcement learning in environments with mixed dynamics, though it builds incrementally on previous research.

The paper tackles the challenge of learning effective world models in vision-based interactive systems with noncontrollable dynamics, such as autonomous driving. It proposes Iso-Dream++, a model-based reinforcement learning approach that isolates controllable state transitions and performs policy optimization with decoupled latent imaginations. The authors report that Iso-Dream++ significantly outperforms existing reinforcement learning models on the CARLA and DeepMind Control benchmarks.

World models learn the consequences of actions in vision-based interactive systems. However, in practical scenarios like autonomous driving, noncontrollable dynamics that are independent or sparsely dependent on action signals often exist, making it challenging to learn effective world models. To address this issue, we propose Iso-Dream++, a model-based reinforcement learning approach that has two main contributions. First, we optimize the inverse dynamics to encourage the world model to isolate controllable state transitions from the mixed spatiotemporal variations of the environment. Second, we perform policy optimization based on the decoupled latent imaginations, where we roll out noncontrollable states into the future and adaptively associate them with the current controllable state. This enables long-horizon visuomotor control tasks to benefit from isolating mixed dynamics sources in the wild, such as self-driving cars that can anticipate the movement of other vehicles, thereby avoiding potential risks. On top of our previous work, we further consider the sparse dependencies between controllable and noncontrollable states, address the training collapse problem of state decoupling, and validate our approach in transfer learning setups. Our empirical study demonstrates that Iso-Dream++ outperforms existing reinforcement learning models significantly on CARLA and DeepMind Control.
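The two ideas in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names (`rollout_noncontrollable`, `associate`, `inverse_dynamics_loss`), the scaled dot-product attention, and the squared-error loss are all assumptions chosen for illustration, and the latent states are plain lists of floats rather than learned representations.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rollout_noncontrollable(z0, step_fn, horizon):
    """Roll the action-free (noncontrollable) branch forward `horizon` steps.

    `step_fn` is a hypothetical stand-in for a learned transition model
    that takes no action input.
    """
    states, z = [], z0
    for _ in range(horizon):
        z = step_fn(z)
        states.append(z)
    return states


def associate(s_ctrl, future_z):
    """Attend from the current controllable state over future
    noncontrollable states (scaled dot-product attention is an assumption).

    Returns the attention weights and the weighted context vector.
    """
    scale = math.sqrt(len(s_ctrl))
    weights = softmax([dot(s_ctrl, z) / scale for z in future_z])
    context = [
        sum(w * z[i] for w, z in zip(weights, future_z))
        for i in range(len(s_ctrl))
    ]
    return weights, context


def inverse_dynamics_loss(predict_action, s_prev, s_next, action):
    """Mean squared error between the true action and the action inferred
    from two consecutive controllable states. A low loss suggests that the
    controllable branch carries the action-dependent information.
    """
    pred = predict_action(s_prev, s_next)
    return sum((p - a) ** 2 for p, a in zip(pred, action)) / len(action)


# Toy usage: noncontrollable dynamics decay by 10% per step, and the
# controllable state moves by exactly the action taken.
future = rollout_noncontrollable([1.0, -2.0], lambda z: [0.9 * x for x in z], 5)
weights, context = associate([0.5, 0.25], future)
loss = inverse_dynamics_loss(
    lambda s0, s1: [b - a for a, b in zip(s0, s1)],
    [0.0, 1.0], [0.5, 0.75], [0.5, -0.25],
)
```

In this sketch the policy would consume the controllable state together with `context`, so that anticipated noncontrollable changes (for example, the motion of other vehicles) can influence the chosen action.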

Code Implementations: 1 repo
