Abstraction for Offline Goal-Conditioned Reinforcement Learning
For researchers in offline goal-conditioned RL, this work provides a method to improve sample efficiency by reusing experience across similar contexts.
The paper introduces a hierarchical framework with relativised options and distinct representations to enable absolute abstraction in offline goal-conditioned RL, showing significant performance improvements over baselines.
In real-world Goal-Conditioned Reinforcement Learning (GCRL), Markov Decision Processes (MDPs) often exhibit significant redundancy due to symmetries and structure shared across state-goal pairs. Hierarchical policies in offline GCRL have so far been motivated mainly by horizon reduction via temporal abstraction; we demonstrate that hierarchy also enables absolute abstraction. By introducing relativised options, together with distinct representations for the different levels of the hierarchy, we show how an agent can reuse experience across similar contexts of the state space. Based on this framework, we introduce two simple algorithms for learning relativised options and abstracting from the absolute frame of reference. Our experiments show that such inductive biases significantly improve performance in offline GCRL.
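To make the idea of abstracting from the absolute frame of reference concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a 2-D grid world with integer coordinates, and the helper names `absolute_input` and `relative_input` are hypothetical. It only illustrates why a low-level policy that sees the offset to its subgoal, rather than absolute positions, can treat translated state-subgoal pairs as the same context.

```python
# Illustrative sketch only: the names and the grid-world setup are assumptions,
# not taken from the paper.

def absolute_input(state, subgoal):
    """Policy input in the absolute frame: state and subgoal coordinates together."""
    return tuple(state) + tuple(subgoal)

def relative_input(state, subgoal):
    """Policy input in a relative frame: only the offset from state to subgoal."""
    return tuple(g - s for s, g in zip(state, subgoal))

# Two hypothetical state-subgoal pairs that differ only by a translation.
s1, g1 = (0, 0), (1, 2)
shift = (10, -5)
s2 = tuple(a + b for a, b in zip(s1, shift))
g2 = tuple(a + b for a, b in zip(g1, shift))

# In the absolute frame the two pairs look different to the policy.
same_abs = absolute_input(s1, g1) == absolute_input(s2, g2)
# In the relative frame they collapse to one input, so experience is shared.
same_rel = relative_input(s1, g1) == relative_input(s2, g2)

print(same_abs, same_rel)  # prints: False True
```

Under these assumptions, a transition collected at one location serves as training data for every translated copy of the same local situation, which is one simple form of the experience reuse described above. A high-level policy could still operate on absolute coordinates, reflecting the use of distinct representations at different levels of the hierarchy.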