LG AI Jan 30

Action-Sufficient Goal Representations

arXiv:2601.22496v1
h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses a critical design issue in hierarchical reinforcement learning for long-horizon tasks, though the advance is incremental because it builds on existing GCRL frameworks.

The paper tackles the problem of goal representation in hierarchical offline goal-conditioned reinforcement learning, showing that value-sufficient representations can fail for optimal control, and introduces action sufficiency, with actor-derived representations outperforming value-based ones on a benchmark.

Hierarchical policies in offline goal-conditioned reinforcement learning (GCRL) address long-horizon tasks by decomposing control into high-level subgoal planning and low-level action execution. A critical design choice in such architectures is the goal representation, the compressed encoding of goals that serves as the interface between these levels. Existing approaches commonly derive goal representations while learning value functions, implicitly assuming that preserving information sufficient for value estimation is adequate for optimal control. We show that this assumption can fail, even when the value estimation is exact, as such representations may collapse goal states that need to be differentiated for action learning. To address this, we introduce an information-theoretic framework that defines action sufficiency, a condition on goal representations necessary for optimal action selection. We prove that value sufficiency does not imply action sufficiency and empirically verify that the latter is more strongly associated with control success in a discrete environment. We further demonstrate that standard log-loss training of low-level policies naturally induces action-sufficient representations. Our experimental results on a popular benchmark demonstrate that our actor-derived representations consistently outperform representations learned via value estimation.
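
The gap between value sufficiency and action sufficiency can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the paper's construction: the five-state chain, the representation `phi(s, g) = |s - g|`, and the simplified check that treats a representation as "sufficient" for a quantity when that quantity is fully determined by the state and the representation. The paper's actual definitions are information-theoretic and are not reproduced here.

```python
from collections import defaultdict

N = 5  # hypothetical chain: states 0..4, actions -1 (left), 0 (stay), +1 (right)

def optimal_value(s, g):
    # Exact optimal value with a cost of 1 per step: negative distance to the goal.
    return -abs(s - g)

def optimal_action(s, g):
    # Move toward the goal; stay once it is reached.
    return 0 if s == g else (1 if g > s else -1)

def phi(s, g):
    # Illustrative representation (an assumption): keeps distance, drops direction.
    return abs(s - g)

def is_sufficient(target):
    """Return True if target(s, g) is determined by (s, phi(s, g)) for all pairs."""
    seen = defaultdict(set)
    for s in range(N):
        for g in range(N):
            seen[(s, phi(s, g))].add(target(s, g))
    return all(len(values) == 1 for values in seen.values())

value_sufficient = is_sufficient(optimal_value)
action_sufficient = is_sufficient(optimal_action)
print("value-sufficient:", value_sufficient)
print("action-sufficient:", action_sufficient)
```

In this sketch the representation recovers the optimal value exactly, yet from state 2 the goals 0 and 4 share the same code while requiring opposite actions, so a policy conditioned only on the state and `phi` cannot act optimally for both.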

