ROSY · Oct 19, 2020

Learning a Low-dimensional Representation of a Safe Region for Safe Reinforcement Learning on Dynamical Systems

arXiv:2010.09555v2 · 17 citations
Originality: Incremental advance
AI Analysis

This work aims to reduce the manual effort needed to apply safe reinforcement learning to complex dynamical systems, such as those in robotics. The advance appears incremental because it builds on existing frameworks.

The paper tackles the challenge of efficiently learning a low-dimensional representation of the safe region for safe reinforcement learning on high-dimensional nonlinear dynamical systems. It proposes a data-driven approach that updates this representation online using feedback data. A quadcopter example shows that the learned representation is more reliable and more representative of the safe region than the one obtained in previous work.

For safely applying reinforcement learning algorithms on high-dimensional nonlinear dynamical systems, a simplified system model is used to formulate a safe reinforcement learning framework. Based on the simplified system model, a low-dimensional representation of the safe region is identified and is used to provide safety estimates for learning algorithms. However, finding a satisfying simplified system model for complex dynamical systems usually requires a considerable amount of effort. To overcome this limitation, we propose in this work a general data-driven approach that is able to efficiently learn a low-dimensional representation of the safe region. Through an online adaptation method, the low-dimensional representation is updated by using the feedback data such that more accurate safety estimates are obtained. The performance of the proposed approach for identifying the low-dimensional representation of the safe region is demonstrated with a quadcopter example. The results show that, compared to previous work, a more reliable and representative low-dimensional representation of the safe region is derived, which then extends the applicability of the safe reinforcement learning framework.

