LG | AI | Apr 3, 2024

Model-based Reinforcement Learning for Parameterized Action Spaces

arXiv:2404.03037v3 | 9 citations | h-index: 49 | ICML
AI Analysis

This work addresses the challenge of efficient learning in complex action spaces, such as those found in robotics and game AI. It represents an incremental improvement over existing PAMDP methods.

The paper tackles the problem of reinforcement learning in parameterized action spaces by proposing a model-based algorithm that learns a dynamics model and uses predictive control, achieving superior sample efficiency and asymptotic performance compared to state-of-the-art methods on standard benchmarks.

We propose a novel model-based reinforcement learning algorithm -- Dynamics Learning and predictive control with Parameterized Actions (DLPA) -- for Parameterized Action Markov Decision Processes (PAMDPs). The agent learns a parameterized-action-conditioned dynamics model and plans with a modified Model Predictive Path Integral control. Through the lens of Lipschitz continuity, we theoretically quantify the difference between the trajectory generated during planning and the optimal trajectory in terms of the value each achieves. Our empirical results on several standard benchmarks show that our algorithm achieves better sample efficiency and asymptotic performance than state-of-the-art PAMDP methods.
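
To make the planning idea concrete, here is a minimal sketch of sampling-based, MPPI-style planning over parameterized actions, where each action is a discrete type `k` paired with a continuous parameter `z`. It is not the authors' implementation: `toy_model`, `plan`, the 1-D environment, and every hyperparameter are hypothetical stand-ins chosen for illustration, and the hand-written `toy_model` takes the place of a learned dynamics model.

```python
import numpy as np

def toy_model(state, k, z):
    """Hypothetical stand-in for a learned dynamics/reward model.
    state: (N,) positions, k: (N,) discrete action types, z: (N,) parameters.
    Type 0 moves left and type 1 moves right, by a step clipped to [0, 1]."""
    direction = np.where(k == 0, -1.0, 1.0)
    next_state = state + direction * np.clip(z, 0.0, 1.0)
    reward = -np.abs(next_state - 3.0)  # assumed goal at x = 3
    return next_state, reward

def plan(state, model, num_types=2, horizon=5, num_samples=1024,
         iters=8, temperature=3.0, gamma=0.99, seed=0):
    rng = np.random.default_rng(seed)
    # Sampling distribution per time step: a categorical over action types
    # and one Gaussian over the parameter of each type.
    probs = np.full((horizon, num_types), 1.0 / num_types)
    mean = np.full((horizon, num_types), 0.5)
    std = np.full((horizon, num_types), 0.5)

    for _ in range(iters):
        # Sample discrete types by inverting the categorical CDF.
        u = rng.random((num_samples, horizon))
        cdf = np.cumsum(probs, axis=1)
        k = (u[..., None] > cdf[None]).sum(axis=-1)
        k = np.minimum(k, num_types - 1)
        # Sample a parameter for every type, then keep the chosen type's one.
        z_all = mean + std * rng.standard_normal((num_samples, horizon, num_types))
        z = np.take_along_axis(z_all, k[..., None], axis=-1)[..., 0]

        # Roll all sampled action sequences through the model.
        s = np.full(num_samples, float(state))
        returns = np.zeros(num_samples)
        for t in range(horizon):
            s, r = model(s, k[:, t], z[:, t])
            returns += (gamma ** t) * r

        # Path-integral weights: exponentiated returns, normalized.
        w = np.exp((returns - returns.max()) / temperature)
        w /= w.sum()

        # Refit the sampling distribution with weighted moments.
        for t in range(horizon):
            for j in range(num_types):
                wj = w * (k[:, t] == j)
                mass = wj.sum()
                probs[t, j] = mass
                if mass > 1e-8:
                    mean[t, j] = (wj * z[:, t]).sum() / mass
                    var = (wj * (z[:, t] - mean[t, j]) ** 2).sum() / mass
                    std[t, j] = max(np.sqrt(var), 0.05)
            probs[t] = probs[t] + 1e-3  # keep every type reachable
            probs[t] /= probs[t].sum()

    # Execute only the first action, as in receding-horizon control.
    k0 = int(np.argmax(probs[0]))
    return k0, float(mean[0, k0])
```

Calling `plan(0.0, toy_model)` returns the first action as a `(type, parameter)` pair. In a model predictive control loop, the agent would execute that action, observe the next state, and replan. The weighted refit of a categorical plus per-type Gaussians is one simple way to handle the hybrid action space in this sketch; the abstract does not specify the exact modification the paper makes to MPPI.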

Code Implementations: 2 repos
