LG · AI · May 10, 2024

Learning Latent Dynamic Robust Representations for World Models

arXiv:2405.06263v2 · 20 citations · h-index: 13 · Has Code · ICML
AI Analysis

The paper addresses the robustness of MBRL agents in noisy visual environments, though it appears incremental because it builds on existing methods such as Dreamer.

The paper tackles the problem of visual Model-Based Reinforcement Learning (MBRL) agents struggling with noisy pixel-based inputs by developing a method to capture task-specific features and filter out irrelevant details, resulting in significant performance improvements in visually complex control tasks with distractors.

Visual Model-Based Reinforcement Learning (MBRL) promises to encapsulate an agent's knowledge about the underlying dynamics of the environment, enabling it to learn a world model that serves as a useful planner. However, top MBRL agents such as Dreamer often struggle with visual pixel-based inputs in the presence of exogenous or irrelevant noise in the observation space, because they fail to capture task-specific features while filtering out irrelevant spatio-temporal details. To tackle this problem, we apply a spatio-temporal masking strategy and a bisimulation principle, combined with latent reconstruction, to capture endogenous task-specific aspects of the environment for world models, effectively eliminating non-essential information. Joint training of representations, dynamics, and policy often leads to instabilities. To further address this issue, we develop a Hybrid Recurrent State-Space Model (HRSSM) structure, enhancing state representation robustness for effective policy learning. Our empirical evaluation demonstrates significant performance improvements over existing methods in a range of visually complex control tasks such as ManiSkill (Gu et al., 2023) with exogenous distractors from the Matterport environment. Our code is available at https://github.com/bit1029public/HRSSM.
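The abstract names spatio-temporal masking as one ingredient but does not spell out the scheme. As a rough illustration only, and not the authors' implementation (the linked repository is the reference for that), the sketch below zeroes out random cubes of a frame sequence with NumPy. The function name `cube_mask` and the parameters `block_t`, `block_hw`, and `mask_ratio` are hypothetical, and masking with zeros is an assumption made for simplicity.

```python
import numpy as np


def cube_mask(frames, block_t=2, block_hw=8, mask_ratio=0.5, rng=None):
    """Zero out random spatio-temporal cubes of a (T, H, W, C) clip.

    Illustrative sketch only; names and defaults are assumptions,
    not taken from the paper or its code release.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, H, W, _ = frames.shape

    # Number of cubes along each axis (ceil division so borders are covered).
    gt = -(-T // block_t)
    gh = -(-H // block_hw)
    gw = -(-W // block_hw)

    # One independent draw per cube: True means "mask this cube".
    drop = rng.random((gt, gh, gw)) < mask_ratio

    # Expand each cube decision to pixel resolution, then crop to the clip size.
    mask = (
        drop.repeat(block_t, axis=0)
        .repeat(block_hw, axis=1)
        .repeat(block_hw, axis=2)[:T, :H, :W]
    )

    masked = frames.copy()
    masked[mask] = 0  # the boolean mask selects (t, h, w); all channels are zeroed
    return masked, mask
```

Because the same draw covers `block_t` consecutive frames, a masked region stays hidden over time rather than flickering frame by frame, which is what distinguishes this from purely spatial masking. How the masked clip would feed into the bisimulation and latent-reconstruction objectives is not specified in the abstract and is left out here.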

Code Implementations: 1 repo