CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning
This work addresses the challenge of uncertainty in multi-agent environments for autonomous systems, though it appears incremental because it builds on existing conformal prediction methods.
The paper tackles the problem of autonomous agents reasoning about other agents' actions in multi-agent reinforcement learning by proposing CAMMARL, which uses conformal predictions to model confident sets of actions with high probability guarantees, and demonstrates improved policy learning in cooperative tasks.
Before taking actions in an environment with more than one intelligent agent, an autonomous agent may benefit from reasoning about the other agents and utilizing a notion of a guarantee or confidence about the behavior of the system. In this article, we propose a novel multi-agent reinforcement learning (MARL) algorithm, CAMMARL, which models the actions of other agents in different situations in the form of confident sets, i.e., sets containing their true actions with high probability. We then use these estimates to inform an agent's decision-making. To estimate such sets, we use conformal prediction, which not only yields an estimate of the most probable outcome but also quantifies the associated uncertainty. For instance, we can predict a set that provably covers the true action with high probability (e.g., 95%). Through several experiments in two fully cooperative multi-agent tasks, we show that CAMMARL elevates the capabilities of an autonomous agent in MARL by modeling conformal prediction sets over the behavior of other agents in the environment and utilizing such estimates to enhance its policy learning.
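To make the idea of a confident action set concrete, the following is a minimal sketch of standard split conformal prediction for a discrete action space. It is not the CAMMARL implementation: the abstract does not specify the nonconformity score or the action model, so the score used here (one minus the predicted probability of the true action) and the names `calibrate_threshold`, `prediction_set`, `cal_probs`, and `cal_actions` are illustrative assumptions. Under the usual exchangeability assumption, the resulting set contains the other agent's true action with probability at least 1 - alpha.

```python
import math
import numpy as np


def calibrate_threshold(cal_probs, cal_actions, alpha=0.05):
    """Split-conformal calibration (illustrative, not the paper's code).

    cal_probs:   (n, num_actions) predicted action probabilities for the
                 other agent on held-out calibration states (assumed input).
    cal_actions: (n,) actions the other agent actually took.
    Returns the score threshold q_hat.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_actions = np.asarray(cal_actions, dtype=int)
    n = len(cal_actions)
    # Assumed nonconformity score: 1 - probability given to the true action.
    scores = 1.0 - cal_probs[np.arange(n), cal_actions]
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this alpha: scores lie in [0, 1],
        # so a threshold of 1.0 keeps every action in the set.
        return 1.0
    return float(np.sort(scores)[k - 1])


def prediction_set(probs, q_hat):
    """Indices of all actions whose score does not exceed the threshold."""
    probs = np.asarray(probs, dtype=float)
    return np.flatnonzero(1.0 - probs <= q_hat)


if __name__ == "__main__":
    # Synthetic stand-in for a learned action model (purely hypothetical data).
    rng = np.random.default_rng(0)
    cal_probs = rng.dirichlet(np.ones(4), size=500)
    cal_actions = np.array([rng.choice(4, p=p) for p in cal_probs])
    q_hat = calibrate_threshold(cal_probs, cal_actions, alpha=0.05)
    print(prediction_set([0.70, 0.20, 0.07, 0.03], q_hat))
```

The size of the returned set reflects the model's uncertainty: a confident prediction tends to give a small set, while a diffuse one gives a larger set. How such sets are encoded and passed to the learning agent's policy is a separate design choice that the abstract does not detail.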