LG · AI · Aug 8, 2025

GCHR: Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning

arXiv:2508.06108v1 · 1 citation · h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses sample efficiency for reinforcement learning practitioners in goal-conditioned settings, though it appears incremental as it builds on existing hindsight and self-imitation techniques.

The paper tackles sample efficiency in goal-conditioned reinforcement learning with sparse rewards. It proposes Hindsight Goal-conditioned Regularization (HGR) and combines it with hindsight self-imitation regularization (HSR); the authors report substantially more efficient sample reuse and the best performance among the compared methods on navigation and manipulation tasks.

Goal-conditioned reinforcement learning (GCRL) with sparse rewards remains a fundamental challenge in reinforcement learning. While hindsight experience replay (HER) has shown promise by relabeling collected trajectories with achieved goals, we argue that trajectory relabeling alone does not fully exploit the available experiences in off-policy GCRL methods, resulting in limited sample efficiency. In this paper, we propose Hindsight Goal-conditioned Regularization (HGR), a technique that generates action regularization priors based on hindsight goals. When combined with hindsight self-imitation regularization (HSR), our approach enables off-policy RL algorithms to maximize experience utilization. Compared to existing GCRL methods that employ HER and self-imitation techniques, our hindsight regularizations achieve substantially more efficient sample reuse and the best performances, which we empirically demonstrate on a suite of navigation and manipulation tasks.
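
To make the abstract's starting point concrete, here is a minimal sketch of HER-style relabeling with a sparse reward, followed by a generic self-imitation penalty computed on the relabeled data. It is an illustration under stated assumptions, not the paper's HGR or HSR formulation, whose exact losses the abstract does not give. The transition layout `(obs, action, achieved_next, goal)`, the tolerance `tol`, the "future" sampling strategy with `k` goals per transition, and the function names are all hypothetical choices made for this example.

```python
import numpy as np


def sparse_reward(achieved, goal, tol=0.05):
    """Return 0 when the achieved state is within tol of the goal, else -1."""
    return 0.0 if np.linalg.norm(achieved - goal) <= tol else -1.0


def her_relabel(episode, k=4, rng=None):
    """Relabel each transition with k goals achieved at or after that step.

    episode: list of (obs, action, achieved_next, goal) tuples (assumed layout).
    Returns a list of (obs, action, new_goal, reward) tuples.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    horizon = len(episode)
    relabeled = []
    for t, (obs, action, achieved_next, _goal) in enumerate(episode):
        for _ in range(k):
            # "future" strategy: pick a step in [t, horizon) from the same episode
            future = rng.integers(t, horizon)
            new_goal = episode[future][2]
            reward = sparse_reward(achieved_next, new_goal)
            relabeled.append((obs, action, new_goal, reward))
    return relabeled


def self_imitation_penalty(policy, batch):
    """Mean squared gap between policy(obs, goal) and the action actually taken.

    A generic behaviour-cloning style regularizer on hindsight data; it stands in
    for the idea of an action prior and is not the loss used in the paper.
    """
    gaps = [np.sum((policy(obs, goal) - action) ** 2) for obs, action, goal, _r in batch]
    return float(np.mean(gaps))
```

In an off-policy learner, a penalty of this kind would typically be added to the actor loss with a weighting coefficient, so that the relabeled transitions shape the policy directly in addition to feeding the critic through the replay buffer.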

