LG · ML · Dec 1, 2019

Adversary A3C for Robust Reinforcement Learning

arXiv:1912.00330v1 · 28 citations
Originality: Incremental advance
AI Analysis

This paper addresses robustness issues in reinforcement learning for tasks such as Atari games and robot control, but the advance is incremental because it builds on the existing A3C method.

The paper tackles the vulnerability of reinforcement learning agents to noise and disturbances by proposing Adversary Robust A3C (AR-A3C), which introduces an adversarial agent during training to improve robustness. The authors report that AR-A3C outperforms A3C in both clean and noisy environments.

Asynchronous Advantage Actor-Critic (A3C) is an effective Reinforcement Learning (RL) algorithm for a wide range of tasks, such as Atari games and robot control. The agent learns a policy and a value function through trial-and-error interactions with the environment until it converges to an optimal policy. Robustness and stability are critical in RL; however, neural networks can be vulnerable to noise from unexpected sources and may fail under even very slight disturbances. We note that agents trained with A3C in mild environments are not able to handle challenging environments. Learning from adversarial examples, we propose an algorithm called Adversary Robust A3C (AR-A3C) to improve the agent's performance in noisy environments. In this algorithm, an adversarial agent is introduced into the learning process to make the agent more robust against adversarial disturbances and therefore more adaptive to noisy environments. Both simulations and real-world experiments are carried out to illustrate the stability of the proposed algorithm. The AR-A3C algorithm outperforms A3C in both clean and noisy environments.
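
The abstract does not spell out how the adversarial agent interacts with the environment, so the following is only a minimal sketch of one common setup, not the paper's implementation. It assumes the adversary adds a bounded disturbance to the agent's action and receives the negative of the agent's reward (zero-sum). The toy environment `PointMassEnv`, the bound `adv_limit`, and the hand-written policies `pd_agent`, `pushing_adversary` and `no_adversary` are all hypothetical stand-ins for A3C-trained networks; no learning takes place here.

```python
class PointMassEnv:
    """Hypothetical 1-D toy task: keep a point mass close to the origin."""

    def __init__(self, dt=0.1):
        self.dt = dt
        self.pos = 0.0
        self.vel = 0.0

    def reset(self):
        self.pos, self.vel = 1.0, 0.0
        return (self.pos, self.vel)

    def step(self, agent_force, adversary_force):
        # Assumption: the disturbance is simply added to the agent's action.
        total = agent_force + adversary_force
        self.vel += total * self.dt
        self.pos += self.vel * self.dt
        agent_reward = -abs(self.pos)
        adversary_reward = -agent_reward  # assumption: zero-sum rewards
        return (self.pos, self.vel), agent_reward, adversary_reward


def clip(x, limit):
    """Clamp x to the interval [-limit, limit]."""
    return max(-limit, min(limit, x))


def rollout(env, agent_policy, adversary_policy, steps=50, adv_limit=0.5):
    """Run one episode and return (agent_return, adversary_return)."""
    state = env.reset()
    agent_return = 0.0
    adversary_return = 0.0
    for _ in range(steps):
        a = clip(agent_policy(state), 1.0)
        # The adversary is weaker than the agent: its force is bounded.
        d = clip(adversary_policy(state), adv_limit)
        state, r_agent, r_adv = env.step(a, d)
        agent_return += r_agent
        adversary_return += r_adv
    return agent_return, adversary_return


def pd_agent(state):
    """Hand-written controller standing in for a trained policy."""
    pos, vel = state
    return -2.0 * pos - 1.0 * vel


def pushing_adversary(state):
    """Hand-written adversary that always pushes away from the origin."""
    pos, _ = state
    return 0.5 if pos >= 0 else -0.5


def no_adversary(state):
    """Clean environment: no disturbance."""
    return 0.0


clean = rollout(PointMassEnv(), pd_agent, no_adversary)
noisy = rollout(PointMassEnv(), pd_agent, pushing_adversary)
print("clean returns:", clean)
print("noisy returns:", noisy)
```

In a training loop built on this setup, each `rollout` would feed both learners: the agent would be updated to raise its return and the adversary to lower it. Comparing `clean` with `noisy` shows how much the disturbance costs this particular fixed controller.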
