LG AI SY ML · Sep 30, 2022

Safe Exploration Method for Reinforcement Learning under Existence of Disturbance

Yoshihiro Okawa, Tomotake Sasaki, Hitoshi Yanami, Toru Namerikawa

arXiv:2209.15452v2 · 5.85 citations · h-index: 24 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses safety-critical applications in real environments, such as robotics, by ensuring safe learning despite disturbances. It is an incremental advance because it builds on prior safe RL methods.

The paper tackles the problem of safe exploration in reinforcement learning under stochastic disturbances by proposing a method that guarantees satisfaction of explicit state constraints with a pre-specified probability, validated through numerical simulations on an inverted pendulum and a robot manipulator.

Recent rapid developments in reinforcement learning algorithms have opened up new possibilities in many fields. However, because these algorithms explore, we must account for risk when applying them to safety-critical problems, especially in real environments. In this study, we deal with a safe exploration problem in reinforcement learning in the presence of disturbance. We define safety during learning as satisfaction of constraint conditions explicitly defined in terms of the state, and we propose a safe exploration method that uses partial prior knowledge of the controlled object and the disturbance. The proposed method assures satisfaction of the explicit state constraints with a pre-specified probability even if the controlled object is exposed to a stochastic disturbance following a normal distribution. As theoretical results, we introduce sufficient conditions for constructing the conservative inputs used in the proposed method, which contain no exploratory component, and prove that safety in the above sense is guaranteed by the proposed method. Furthermore, we illustrate the validity and effectiveness of the proposed method through numerical simulations of an inverted pendulum and a four-bar parallel link robot manipulator.
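The idea of satisfying a state constraint with a pre-specified probability under normally distributed disturbance can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a scalar linear system x_next = a*x + b*u + w with w ~ N(0, sigma^2), a symmetric constraint |x_next| <= x_max, and a hypothetical conservative fallback input. All names (`a`, `b`, `sigma`, `x_max`, `eta`, `is_probably_safe`, `choose_input`) are illustrative assumptions.

```python
from statistics import NormalDist


def is_probably_safe(x, u, a, b, sigma, x_max, eta):
    """Sufficient check that |x_next| <= x_max holds with probability >= eta.

    Assumed toy model: x_next = a*x + b*u + w, with w ~ N(0, sigma^2).
    """
    # Two-sided standard-normal quantile, so that P(|w| <= z * sigma) = eta.
    z = NormalDist().inv_cdf((1.0 + eta) / 2.0)
    predicted_mean = a * x + b * u
    # If the mean keeps a margin of z * sigma from both bounds,
    # the constraint holds with probability at least eta.
    return abs(predicted_mean) + z * sigma <= x_max


def choose_input(x, u_explore, a=1.0, b=1.0, sigma=0.1, x_max=1.0, eta=0.95):
    """Use the exploratory input if it passes the check, else fall back."""
    if is_probably_safe(x, u_explore, a, b, sigma, x_max, eta):
        return u_explore
    # Hypothetical conservative input (an assumption, not from the paper):
    # drive the predicted mean to zero. It only meets the probability
    # level when z * sigma <= x_max.
    return -(a / b) * x


if __name__ == "__main__":
    print(choose_input(0.5, 0.2))  # exploratory input accepted
    print(choose_input(0.5, 0.4))  # exploratory input rejected, fallback used
```

With the default values, an exploratory input of 0.2 from state 0.5 keeps the predicted mean far enough from the bound and is used as is, while an input of 0.4 fails the check and is replaced by the fallback. The check is conservative by design: it can reject inputs that would in fact meet the probability level.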
