LG AI | Apr 3, 2024

Model-based Reinforcement Learning for Parameterized Action Spaces

Renhao Zhang, Haotian Fu, Yilin Miao, George Konidaris

arXiv:2404.03037v311.59 citations | h-index: 49 | Has Code | ICML

Originality: Incremental advance

AI Analysis

This work addresses the challenge of efficient learning in complex action spaces for robotics or game AI, and represents an incremental improvement over existing PAMDP methods.

The paper tackles the problem of reinforcement learning in parameterized action spaces by proposing a model-based algorithm that learns a dynamics model and uses predictive control, achieving superior sample efficiency and asymptotic performance compared to state-of-the-art methods on standard benchmarks.

We propose a novel model-based reinforcement learning algorithm, Dynamics Learning and predictive control with Parameterized Actions (DLPA), for Parameterized Action Markov Decision Processes (PAMDPs). The agent learns a parameterized-action-conditioned dynamics model and plans with a modified Model Predictive Path Integral (MPPI) control. Using Lipschitz continuity, we theoretically quantify the difference in achieved value between the trajectory generated during planning and the optimal trajectory. Our empirical results on several standard benchmarks show that our algorithm achieves better sample efficiency and asymptotic performance than state-of-the-art PAMDP methods.
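To make the planning idea in the abstract concrete, here is a minimal sketch of sampling-based planning over parameterized actions, where each action is a discrete choice `k` paired with a continuous parameter `z`. It is not the authors' implementation. The function `toy_model` is a hypothetical stand-in for a learned dynamics and reward model, and the update rule (a categorical distribution over `k` plus one Gaussian per discrete action over `z`, both refit from exponentially weighted returns) is an assumed simplification of MPPI-style planning, not the paper's exact modification.

```python
import numpy as np

def toy_model(state, k, z):
    """Hypothetical stand-in for a learned model: k=0 moves left, k=1 moves right,
    z is the step size (clipped to [0, 1]); reward is negative distance to 0."""
    direction = np.where(k == 0, -1.0, 1.0)
    next_state = state + direction * np.clip(z, 0.0, 1.0)
    return next_state, -np.abs(next_state)

def plan(state, model, horizon=5, n_samples=256, n_iters=4,
         n_discrete=2, temperature=2.0, seed=0):
    """Return the first (discrete action, parameter) of an MPPI-style plan."""
    rng = np.random.default_rng(seed)
    probs = np.full((horizon, n_discrete), 1.0 / n_discrete)  # categorical over k
    mean = np.full((horizon, n_discrete), 0.5)                # Gaussian over z, per k
    std = np.full((horizon, n_discrete), 0.5)
    t_idx = np.arange(horizon)
    for _ in range(n_iters):
        # Sample discrete actions, then parameters conditioned on them.
        ks = np.stack([rng.choice(n_discrete, size=n_samples, p=probs[t])
                       for t in range(horizon)], axis=1)
        noise = rng.standard_normal((n_samples, horizon))
        zs = mean[t_idx, ks] + std[t_idx, ks] * noise
        # Roll every sampled action sequence through the model.
        s = np.full(n_samples, float(state))
        returns = np.zeros(n_samples)
        for t in range(horizon):
            s, r = model(s, ks[:, t], zs[:, t])
            returns += r
        # Exponentially weight trajectories by return (softmax with temperature).
        w = np.exp((returns - returns.max()) / temperature)
        w /= w.sum()
        # Refit both distributions from the weighted samples.
        for t in range(horizon):
            for k in range(n_discrete):
                mask = ks[:, t] == k
                wk = w[mask].sum()
                probs[t, k] = wk
                if wk > 1e-8:
                    m = (w[mask] * zs[mask, t]).sum() / wk
                    v = (w[mask] * (zs[mask, t] - m) ** 2).sum() / wk
                    mean[t, k], std[t, k] = m, max(np.sqrt(v), 0.05)
            probs[t] = probs[t] + 1e-3   # keep every discrete action reachable
            probs[t] /= probs[t].sum()
    k0 = int(np.argmax(probs[0]))
    return k0, float(np.clip(mean[0, k0], 0.0, 1.0))
```

In a model-predictive control loop, an agent would call `plan(current_state, model)` at every step, execute only the returned first action, and replan from the next observed state. Keeping a separate Gaussian per discrete action is one simple way to respect the dependence of the continuous parameter on the discrete choice.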
