RO · LG · SY · Jan 21, 2021

Model-based Policy Search for Partially Measurable Systems

arXiv:2101.08740v1
Originality: Incremental advance
AI Analysis

The paper addresses partial observability, a common challenge in robotics and control systems, and represents an incremental improvement over existing GP-based methods.

The paper tackles model-based reinforcement learning for systems whose states are not directly measurable. It proposes MC-PILCO4PMS, which explicitly models state observers, uses Gaussian Processes to model the dynamics, and updates the policy with a Monte Carlo approach. The algorithm is tested in simulation and on two real systems.

In this paper, we propose a Model-Based Reinforcement Learning (MBRL) algorithm for Partially Measurable Systems (PMS), i.e., systems where the state cannot be directly measured but must be estimated through proper state observers. The proposed algorithm, named Monte Carlo Probabilistic Inference for Learning COntrol for Partially Measurable Systems (MC-PILCO4PMS), relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to update the policy parameters. Unlike previous GP-based MBRL algorithms, MC-PILCO4PMS explicitly models the presence of state observers during policy optimization, which makes it possible to handle PMS. The effectiveness of the proposed algorithm has been tested both in simulation and on two real systems.
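
The idea of keeping the state observer inside the simulated rollouts can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy mass-spring-damper in `sample_dynamics` stands in for sampling from a learned GP model, only the position is measured, the observer is a finite-difference velocity estimate, and the policy is a linear feedback law with hypothetical gains. The sketch only computes the Monte Carlo cost estimate over particles; the optimization step that would update the policy parameters is omitted.

```python
import numpy as np


def sample_dynamics(x, u, rng, dt=0.05, noise_std=0.01):
    """Hypothetical stand-in for sampling the next state from a learned GP.

    x has shape (particles, 2) with columns [position, velocity];
    u has shape (particles,).
    """
    pos, vel = x[:, 0], x[:, 1]
    acc = -pos - 0.1 * vel + u  # toy mass-spring-damper, not from the paper
    new_vel = vel + dt * acc + noise_std * rng.standard_normal(len(x))
    new_pos = pos + dt * new_vel
    return np.stack([new_pos, new_vel], axis=1)


def rollout_cost(gains, n_particles=200, horizon=100, dt=0.05,
                 meas_std=0.005, seed=0):
    """Monte Carlo estimate of the mean squared position over a rollout.

    The policy never sees the simulated state directly: it acts on a noisy
    position measurement and on an observer's velocity estimate.
    """
    rng = np.random.default_rng(seed)
    x = np.tile([1.0, 0.0], (n_particles, 1))  # all particles start at pos=1
    y_prev = x[:, 0] + meas_std * rng.standard_normal(n_particles)
    cost = 0.0
    for _ in range(horizon):
        # Measurement model: only the position is measured, with noise.
        y = x[:, 0] + meas_std * rng.standard_normal(n_particles)
        # Observer: finite-difference velocity estimate from measurements.
        vel_hat = (y - y_prev) / dt
        # Policy: linear feedback on the measured and estimated quantities.
        u = -gains[0] * y - gains[1] * vel_hat
        x = sample_dynamics(x, u, rng, dt)
        cost += np.mean(x[:, 0] ** 2)  # average over particles
        y_prev = y
    return cost / horizon


if __name__ == "__main__":
    print("no control:  ", rollout_cost((0.0, 0.0)))
    print("with control:", rollout_cost((2.0, 1.0)))
```

Because the measurement noise and the observer are part of the rollout, the cost estimate reflects how the policy would behave when fed estimated states rather than true ones. In this sketch, a policy-search loop would adjust `gains` to reduce the value returned by `rollout_cost`.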

