Toward Policy Explanations for Multi-Agent Reinforcement Learning
This work addresses the need for transparency and collaboration in multi-agent systems, such as cooperative AI and autonomous driving, by providing explanation methods tailored to MARL. The contribution is incremental in that it extends single-agent explainability approaches to the multi-agent setting.
The paper tackles the problem of explaining agent decisions in multi-agent reinforcement learning (MARL) by introducing methods for policy summarization and language explanations. Experiments on three MARL domains demonstrate scalability, and a user study shows that the explanations significantly improve user performance and satisfaction ratings.
Advances in multi-agent reinforcement learning (MARL) enable sequential decision making for a range of exciting multi-agent applications such as cooperative AI and autonomous driving. Explaining agent decisions is crucial for improving system transparency, increasing user satisfaction, and facilitating human-agent collaboration. However, existing work on explainable reinforcement learning mostly focuses on the single-agent setting and is not suited to the challenges posed by multi-agent environments. We present novel methods to generate two types of policy explanations for MARL: (i) policy summarization of agent cooperation and task sequence, and (ii) language explanations that answer queries about agent behavior. Experimental results on three MARL domains demonstrate the scalability of our methods. A user study shows that the generated explanations significantly improve user performance and increase subjective ratings on metrics such as user satisfaction.