LG · AI · RO · May 19, 2025

Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning

arXiv:2505.13144v1 · 5 citations · h-index: 5 · ICML
Originality: Highly original
AI Analysis

This work addresses performance degradation in offline RL for robotics and simulation tasks. It is incremental in that it builds on existing MBRL approaches, but it contributes a novel augmentation technique.

The paper tackles the difficulty that offline model-based reinforcement learning has with sparse-reward, long-horizon tasks. It introduces TempDATA, a framework that generates augmented transitions in a temporally structured latent space. TempDATA outperforms previous methods and matches or surpasses diffusion-based and goal-conditioned RL on benchmarks such as D4RL AntMaze and FrankaKitchen.

The goal of offline reinforcement learning (RL) is to extract a high-performance policy from fixed datasets while minimizing performance degradation due to out-of-distribution (OOD) samples. Offline model-based RL (MBRL) is a promising approach that ameliorates OOD issues by enriching state-action transitions with augmentations synthesized via a learned dynamics model. Unfortunately, seminal offline MBRL methods often struggle in sparse-reward, long-horizon tasks. In this work, we introduce a novel MBRL framework, dubbed Temporal Distance-Aware Transition Augmentation (TempDATA), that generates augmented transitions in a temporally structured latent space rather than in raw state space. To model long-horizon behavior, TempDATA learns a latent abstraction that captures temporal distance at both the trajectory and transition levels of the state space. Our experiments confirm that TempDATA outperforms previous offline MBRL methods and matches or surpasses the performance of diffusion-based trajectory augmentation and goal-conditioned RL on the D4RL AntMaze, FrankaKitchen, CALVIN, and pixel-based FrankaKitchen benchmarks.
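The two ingredients named in the abstract, temporal-distance supervision and model-based transition augmentation in a latent space, can be illustrated with a small NumPy sketch. This is not the paper's implementation. It assumes the latents `z` already come from some encoder, uses a least-squares linear model as a stand-in for a learned dynamics network, and every function name below is hypothetical.

```python
import numpy as np


def temporal_distance_pairs(episode_len, max_gap, rng, n_pairs):
    """Sample index pairs (i, j) from one episode; j - i is the temporal distance.

    Assumption: distance is measured in environment steps within one trajectory.
    """
    i = rng.integers(0, episode_len - 1, size=n_pairs)   # i in [0, episode_len - 2]
    gap = rng.integers(1, max_gap + 1, size=n_pairs)     # gap in [1, max_gap]
    j = np.minimum(i + gap, episode_len - 1)             # clip at the episode end
    return i, j, j - i


def fit_linear_latent_dynamics(z, a, z_next):
    """Least-squares fit of z_next ~= [z, a, 1] @ W.

    Stand-in (assumption) for a learned neural dynamics model.
    """
    x = np.concatenate([z, a, np.ones((len(z), 1))], axis=1)
    w, *_ = np.linalg.lstsq(x, z_next, rcond=None)
    return w


def augment_in_latent(z0, policy, w, horizon):
    """Roll the fitted model forward from dataset latents z0.

    Returns a list of synthetic (z, a, z_next) batches, one per rollout step.
    """
    transitions = []
    z = z0
    for _ in range(horizon):
        a = policy(z)
        x = np.concatenate([z, a, np.ones((len(z), 1))], axis=1)
        z_next = x @ w
        transitions.append((z, a, z_next))
        z = z_next
    return transitions
```

In this sketch the `(i, j, j - i)` triples would serve as regression targets for a representation whose latent distances track step counts, and the short rollouts from `augment_in_latent` would be appended to the offline dataset before policy training. Short horizons are a common choice in offline MBRL because model error compounds with rollout length.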
