Visualizing and Understanding Atari Agents
This work addresses the interpretability of deep RL agents, a concern for both developers and researchers. Its contribution is incremental in that it builds on existing saliency map techniques.
The paper addresses the problem of understanding the decision-making strategies of deep reinforcement learning agents in Atari 2600 environments. It introduces a method for generating saliency maps and uses it to show what strong agents attend to, whether they make decisions for the right or wrong reasons, and how they evolve during learning. Tests with non-expert human subjects show that the maps improve their ability to reason about these agents.
While deep reinforcement learning (deep RL) agents are effective at maximizing rewards, it is often unclear what strategies they use to do so. In this paper, we take a step toward explaining deep RL agents through a case study using Atari 2600 environments. In particular, we focus on using saliency maps to understand how an agent learns and executes a policy. We introduce a method for generating useful saliency maps and use it to show 1) what strong agents attend to, 2) whether agents are making decisions for the right or wrong reasons, and 3) how agents evolve during learning. We also test our method on non-expert human subjects and find that it improves their ability to reason about these agents. Overall, our results show that saliency information can provide significant insight into an RL agent's decisions and learning behavior.
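The abstract does not describe how the saliency maps are computed. One common family of approaches is perturbation-based saliency: blur a small region of the input frame and measure how much the policy's output changes. The sketch below illustrates that general idea only and should not be read as the paper's implementation. The names `toy_policy`, `box_blur`, `gaussian_mask`, and `perturbation_saliency`, the box blur used in place of a proper Gaussian blur, and the `stride` and `sigma` defaults are all assumptions made for illustration.

```python
import numpy as np


def gaussian_mask(shape, center, sigma):
    """2D Gaussian bump with peak value 1 at `center` (row, col)."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    dist_sq = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))


def box_blur(frame, radius=2):
    """Mean over a (2*radius+1)^2 window; a cheap stand-in for a Gaussian blur."""
    height, width = frame.shape
    padded = np.pad(frame, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros((height, width), dtype=float)
    for dr in range(size):
        for dc in range(size):
            out += padded[dr:dr + height, dc:dc + width]
    return out / size ** 2


def perturbation_saliency(policy, frame, stride=4, sigma=3.0):
    """Score each grid location by how much blurring it changes the policy output."""
    frame = frame.astype(float)
    blurred = box_blur(frame)
    baseline = policy(frame)
    row_centers = range(0, frame.shape[0], stride)
    col_centers = range(0, frame.shape[1], stride)
    scores = np.zeros((len(row_centers), len(col_centers)))
    for a, i in enumerate(row_centers):
        for b, j in enumerate(col_centers):
            mask = gaussian_mask(frame.shape, (i, j), sigma)
            # Blend the blurred frame into the original around (i, j).
            perturbed = frame * (1.0 - mask) + blurred * mask
            diff = policy(perturbed) - baseline
            scores[a, b] = 0.5 * np.sum(diff ** 2)
    return scores


def toy_policy(frame):
    """Hypothetical policy: two logits that depend only on the top-left 8x8 patch."""
    total = frame[:8, :8].sum()
    return np.array([total, -total])


# Hypothetical 32x32 frame with a single bright 2x2 "ball" near the top-left corner.
frame = np.zeros((32, 32))
frame[3:5, 3:5] = 1.0
saliency = perturbation_saliency(toy_policy, frame)
```

In this toy setup the score near the ball is clearly positive, while the score in the far corner is effectively zero, since blurring an empty region that the toy policy ignores leaves its output unchanged. A real agent would replace `toy_policy` with a forward pass of the trained network, and the coarse score grid would typically be upsampled and overlaid on the game frame.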