LG · May 25

Latent Representation Alignment for Offline Goal-Conditioned Reinforcement Learning

Hyungkyu Kang, Byeongchan Kim, Min-hwan Oh

arXiv:2605.25740 · 14.1 · Has Code

Predicted impact: top 49% in LG · last 90 days · Originality: Incremental advance

AI Analysis

For researchers in offline goal-conditioned reinforcement learning, this work addresses a fundamental bottleneck and provides a practical algorithm that substantially improves performance on long-horizon tasks.

The paper identifies erroneous generalization in goal-conditioned value functions as a key bottleneck in offline GCRL, and proposes LAVL, which integrates latent-representation-based value generalization with hierarchical planning. LAVL achieves the highest performance on 20 out of 22 datasets in OGBench, significantly outperforming prior methods, especially in long-horizon tasks.

Offline goal-conditioned reinforcement learning (GCRL) provides a practical framework for obtaining goal-reaching policies from fixed datasets. However, learning a reliable goal-conditioned value function in long-horizon tasks remains challenging. In this paper, we identify erroneous generalization in goal-conditioned value functions as a fundamental bottleneck, and demonstrate that appropriate inductive bias in the value function is crucial for addressing the bottleneck. Building on these findings, we propose Latent-Aligned Value Learning (LAVL), an offline GCRL algorithm that integrates latent-representation-based value generalization with hierarchical planning in a unified framework. Extensive experiments on OGBench demonstrate that LAVL consistently outperforms existing offline GCRL methods, achieving the highest performance on 20 out of 22 datasets. Notably, LAVL exhibits strong performance in long-horizon tasks and trajectory stitching datasets, where prior methods suffer significant performance degradation. Our code is available at https://github.com/oh-lab/LAVL.git.
