AI LG MA Mar 24, 2023

Causality Detection for Efficient Multi-Agent Reinforcement Learning

Rafael Pina, Varuna De Silva, Corentin Artaud

arXiv:2303.14227v12.11 citations h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses inefficiencies in team-based MARL, though the advance is incremental because it builds on existing causality concepts.

The paper tackles the problem of lazy agents in multi-agent reinforcement learning by formalizing temporal causality to penalize suboptimal behaviors, resulting in improved holistic team performance and individual agent capabilities across different environments.

When learning a task as a team, some agents in Multi-Agent Reinforcement Learning (MARL) may fail to understand their true impact on the performance of the team. Such agents end up learning sub-optimal policies and demonstrate undesired lazy behaviours. To investigate this problem, we start by formalising the use of temporal causality applied to MARL problems. We then show how causality can be used to penalise such lazy agents and improve their behaviours. By understanding how their local observations are causally related to the team reward, each agent in the team can adjust its individual credit based on whether it helped to cause the reward or not. We show empirically that using causality estimations in MARL improves not only the holistic performance of the team, but also the individual capabilities of each agent. We observe that the improvements are consistent across a set of different environments.
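
The credit adjustment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `causal_credit`, the `penalty` parameter, and the assumption that a separate causality detector has already produced one boolean flag per agent are all hypothetical choices made for illustration. Under those assumptions, an agent judged to have helped cause the team reward keeps it, and any other agent receives a penalty instead.

```python
from typing import List, Sequence


def causal_credit(
    team_reward: float,
    caused: Sequence[bool],
    penalty: float = 0.0,
) -> List[float]:
    """Split a shared team reward into per-agent rewards.

    `caused[i]` is assumed to come from some external causality
    detector (hypothetical here) and says whether agent i's local
    observations were judged causally related to the team reward.
    Agents flagged True keep the team reward; the rest get `penalty`.
    """
    return [team_reward if flag else penalty for flag in caused]


# Example: three agents, the second one is judged "lazy" at this step.
rewards = causal_credit(1.0, [True, False, True], penalty=-0.1)
print(rewards)
```

With a rule of this shape, a lazy agent no longer receives reward for outcomes it did not help cause, which is the incentive the abstract describes. How the causal flags are estimated is the substance of the paper and is not reproduced here.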
