Enhancing Reinforcement Learning Agents with Local Guides
This work addresses improving the performance of reinforcement learning agents, especially in safety-critical applications. The contribution appears incremental, since it builds on existing methods.
The paper tackles the problem of integrating local guide policies into reinforcement learning agents to improve performance, particularly in safety-critical systems. It introduces a novel algorithm based on noisy policy-switching and an Approximate Policy Evaluation (APE) scheme, and reports that the agent leverages these policies efficiently across the tested environments.
This paper addresses the problem of integrating local guide policies into a Reinforcement Learning agent. For this, we show how to adapt existing algorithms to this setting before introducing a novel algorithm based on a noisy policy-switching procedure. This approach builds on a proper Approximate Policy Evaluation (APE) scheme to provide a perturbation that carefully leads the local guides towards better actions. We evaluated our method on a set of classical Reinforcement Learning problems, including safety-critical systems where the agent cannot enter some areas at the risk of triggering catastrophic consequences. In all the proposed environments, our agent proved to be efficient at leveraging those policies to improve the performance of any APE-based Reinforcement Learning algorithm, especially in its first learning stages.
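The abstract does not spell out the switching rule or the form of the perturbation, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes continuous actions, a learned critic `q_fn(state, action)` standing in for the APE estimate, and a hypothetical predicate `guide_is_active(state)` marking the region where a local guide is defined. Inside that region the guide's action is nudged along a finite-difference estimate of the critic's gradient and then perturbed with Gaussian noise; elsewhere the agent's own policy acts. All function and parameter names here are assumptions made for illustration.

```python
import numpy as np


def perturbed_guide_action(q_fn, state, guide_action, step=0.1,
                           noise_std=0.05, eps=1e-4, rng=None):
    """Nudge a guide action toward higher critic value, then add noise.

    Hypothetical helper: the gradient of q_fn with respect to the action
    is estimated by central finite differences.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = np.atleast_1d(np.asarray(guide_action, dtype=float))
    grad = np.zeros_like(a)
    for i in range(a.size):
        d = np.zeros_like(a)
        d[i] = eps
        grad[i] = (q_fn(state, a + d) - q_fn(state, a - d)) / (2 * eps)
    noise = noise_std * rng.standard_normal(a.shape)
    return a + step * grad + noise


def select_action(state, agent_policy, guide_policy, guide_is_active,
                  q_fn, **perturb_kwargs):
    """Switch between the perturbed local guide and the agent's policy."""
    if guide_is_active(state):
        return perturbed_guide_action(q_fn, state, guide_policy(state),
                                      **perturb_kwargs)
    return np.atleast_1d(np.asarray(agent_policy(state), dtype=float))


if __name__ == "__main__":
    # Toy critic whose value peaks at action = 1 (an assumption for the demo).
    toy_q = lambda s, a: -float(np.sum((a - 1.0) ** 2))
    action = select_action(
        np.array([-1.0]),
        agent_policy=lambda s: [0.5],
        guide_policy=lambda s: [0.0],
        guide_is_active=lambda s: s[0] < 0,
        q_fn=toy_q,
        noise_std=0.0,
    )
    print(action)
```

With the noise switched off, the toy example moves the guide's action from 0 toward the critic's preferred value of 1 by one small gradient step. Whether the paper perturbs the guide in this way, or switches on a state-based criterion like the one assumed here, would need to be checked against the full text.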