LG AI, Oct 26, 2022

Knowledge-Guided Exploration in Deep Reinforcement Learning

Sahisnu Mazumder, Bing Liu, Shuai Wang, Yingxuan Zhu, Xiaotian Yin, Lifeng Liu, Jian Li

arXiv:2210.15670v14.66 citations h-index: 30

Originality: Incremental advance

AI Analysis

The work addresses training-efficiency issues for researchers and practitioners who apply deep RL in domains with permissibility constraints, though the advance is incremental because it builds on existing algorithms.

The paper tackles slow deep reinforcement learning training by introducing a state-action permissibility (SAP) property that guides exploration, which yields markedly faster training.

This paper proposes a new method to drastically speed up deep reinforcement learning (deep RL) training for problems that have the property of state-action permissibility (SAP). Two types of permissibility are defined under SAP. The first type says that after an action $a_t$ is performed in a state $s_t$ and the agent has reached the new state $s_{t+1}$, the agent can decide whether $a_t$ is permissible or not permissible in $s_t$. The second type says that even without performing $a_t$ in $s_t$, the agent can already decide whether $a_t$ is permissible or not in $s_t$. An action is not permissible in a state if the action can never lead to an optimal solution and thus should not be tried (over and over again). We incorporate the proposed SAP property and encode action permissibility knowledge into two state-of-the-art deep RL algorithms to guide their state-action exploration together with a virtual stopping strategy. Results show that the SAP-based guidance can markedly speed up RL training.
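The two permissibility types described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `PermissibilityGuide` class, its method names, the tabular set of rejected pairs, and the fallback to the full action set are all assumptions made for illustration, and the virtual stopping strategy is not modelled. Type 2 knowledge is represented by a hypothetical `known_before(s, a)` predicate that can be evaluated without acting; type 1 knowledge is recorded only after the agent has performed $a_t$ in $s_t$ and observed $s_{t+1}$.

```python
import random
from typing import Callable, Hashable, Optional, Sequence, Set, Tuple

State = Hashable
Action = Hashable


class PermissibilityGuide:
    """Illustrative filter that keeps exploration away from non-permissible actions."""

    def __init__(
        self,
        actions: Sequence[Action],
        known_before: Optional[Callable[[State, Action], bool]] = None,
    ) -> None:
        self.actions = list(actions)
        # Type 2 (assumed form): a predicate that decides permissibility
        # of a in s WITHOUT performing a.
        self.known_before = known_before
        # Type 1 (assumed form): (s, a) pairs judged non-permissible only
        # AFTER a was performed in s and the next state was observed.
        self.rejected: Set[Tuple[State, Action]] = set()

    def is_permissible(self, s: State, a: Action) -> bool:
        if (s, a) in self.rejected:
            return False
        if self.known_before is not None and not self.known_before(s, a):
            return False
        return True

    def record_outcome(self, s: State, a: Action, permissible_after: bool) -> None:
        """Store a type-1 verdict reached after observing the next state."""
        if not permissible_after:
            self.rejected.add((s, a))

    def explore(self, s: State, rng: random.Random) -> Action:
        """Sample an exploratory action among those still considered permissible."""
        candidates = [a for a in self.actions if self.is_permissible(s, a)]
        # Assumption: if every action has been ruled out, fall back to all actions.
        return rng.choice(candidates or self.actions)


# Toy corridor with positions 0..4 and the goal at 4 (hypothetical environment).
GOAL = 4
guide = PermissibilityGuide(
    actions=[-1, +1],
    # Type 2: stepping outside the corridor is known to be useless before acting.
    known_before=lambda s, a: 0 <= s + a <= GOAL,
)

s, a = 2, -1
s_next = s + a
# Type 1: after acting, the agent sees that it moved away from the goal.
guide.record_outcome(s, a, permissible_after=abs(GOAL - s_next) <= abs(GOAL - s))
```

In this toy setting the pair (2, -1) is excluded from later exploration in state 2, so the agent stops retrying it. A deep RL agent with continuous states would need some generalizing mechanism in place of the exact-match set used here; that substitution is outside the scope of this sketch.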
