LG · AI · ML · Oct 11, 2024

Hierarchical Universal Value Function Approximators

arXiv:2410.08997v2 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses scaling and generalization in hierarchical reinforcement learning, representing an incremental extension of existing UVFA methods.

The paper tackles the problem of scaling universal value function approximators to hierarchical reinforcement learning by introducing hierarchical universal value function approximators (H-UVFAs), which outperform corresponding UVFAs in generalization.

There have been key advancements in building universal approximators for multi-goal collections of reinforcement learning value functions, which are key elements in estimating long-term returns of states in a parameterized manner. We extend this line of work to hierarchical reinforcement learning, using the options framework, by introducing hierarchical universal value function approximators (H-UVFAs). This allows us to leverage the added benefits of scaling, planning, and generalization expected in temporal abstraction settings. We develop supervised and reinforcement learning methods for learning embeddings of the states, goals, options, and actions in the two hierarchical value functions: $Q(s, g, o; \theta)$ and $Q(s, g, o, a; \theta)$. Finally, we demonstrate generalization of the H-UVFAs and show that they outperform corresponding UVFAs.
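To make the two value functions concrete, the following is a minimal sketch of how separate embeddings of states, goals, options, and actions could be combined into $Q(s, g, o)$ and $Q(s, g, o, a)$. It assumes a small tabular setting and a multiplicative (elementwise product, then sum) combination of the embeddings; that combination rule, the random embedding tables `phi`, `psi`, `chi`, `xi`, the shared `rank`, and the helper names `q_option` and `q_action` are all illustrative assumptions, not the paper's stated architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes and embedding dimension (assumptions for illustration).
n_states, n_goals, n_options, n_actions, rank = 5, 4, 3, 2, 8

# Hypothetical embedding tables: one row per discrete state, goal, option, action.
# In practice these would be learned; here they are random placeholders.
phi = rng.normal(size=(n_states, rank))   # state embeddings
psi = rng.normal(size=(n_goals, rank))    # goal embeddings
chi = rng.normal(size=(n_options, rank))  # option embeddings
xi = rng.normal(size=(n_actions, rank))   # action embeddings


def q_option(s: int, g: int, o: int) -> float:
    """Upper-level value Q(s, g, o): sum over the elementwise product of embeddings."""
    return float(np.sum(phi[s] * psi[g] * chi[o]))


def q_action(s: int, g: int, o: int, a: int) -> float:
    """Lower-level value Q(s, g, o, a): same form with an extra action embedding."""
    return float(np.sum(phi[s] * psi[g] * chi[o] * xi[a]))


# The same products evaluated for every index combination at once.
Q_sgo = np.einsum("sk,gk,ok->sgo", phi, psi, chi)
Q_sgoa = np.einsum("sk,gk,ok,ak->sgoa", phi, psi, chi, xi)

# Greedy option for a given state and goal, then greedy action within that option.
s, g = 1, 2
best_option = int(np.argmax(Q_sgo[s, g]))
best_action = int(np.argmax(Q_sgoa[s, g, best_option]))
```

Because each entity has its own embedding, a value for an unseen (state, goal, option) combination can still be computed from embeddings learned on other combinations, which is the kind of generalization the abstract refers to. Whether the two value functions share embeddings, as they do in this sketch, is also an assumption.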

