Tackling Uncertainties in Multi-Agent Reinforcement Learning through Integration of Agent Termination Dynamics
This work addresses safety and efficiency issues in multi-agent systems, offering an incremental improvement for applications like robotics or gaming.
The paper tackles uncertainty and risk in cooperative multi-agent reinforcement learning by integrating distributional learning with a safety-focused loss function, which improves convergence and outperforms state-of-the-art baselines in safety and task completion on the StarCraft II micromanagement benchmark.
Multi-Agent Reinforcement Learning (MARL) has gained significant traction for solving complex real-world tasks, but the inherent stochasticity and uncertainty in these environments pose substantial challenges to efficient and robust policy learning. While Distributional Reinforcement Learning has been successfully applied in single-agent settings to address risk and uncertainty, its application in MARL remains limited. In this work, we propose a novel approach that integrates distributional learning with a safety-focused loss function to improve convergence in cooperative MARL tasks. Specifically, we introduce a Barrier Function based loss that incorporates safety metrics, identified from inherent faults in the system, into the policy learning process. This additional loss term helps mitigate risks and encourages safer exploration during the early stages of training. We evaluate our method on the StarCraft II micromanagement benchmark, where it demonstrates improved convergence and outperforms state-of-the-art baselines in terms of both safety and task completion. Our results suggest that incorporating safety considerations can significantly enhance learning performance in complex, multi-agent environments.
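As an illustration only, a barrier-style safety term could be combined with a distributional loss roughly as follows. This is a minimal sketch under stated assumptions, not the formulation used in the paper: a quantile Huber loss stands in for the distributional objective, the safety metric is assumed to be the fraction of agents that have not terminated, and the names `barrier_penalty`, `min_alive`, and `barrier_weight` are hypothetical.

```python
import math


def quantile_huber_loss(pred_quantiles, target, kappa=1.0):
    """Quantile-regression Huber loss of predicted quantiles against one scalar target."""
    n = len(pred_quantiles)
    total = 0.0
    for i, q in enumerate(pred_quantiles):
        tau = (i + 0.5) / n  # midpoint quantile fraction for the i-th quantile
        u = target - q       # residual between target and predicted quantile
        if abs(u) <= kappa:
            huber = 0.5 * u * u
        else:
            huber = kappa * (abs(u) - 0.5 * kappa)
        # Asymmetric weight: over- and under-estimates are penalised differently.
        weight = abs(tau - (1.0 if u < 0 else 0.0))
        total += weight * huber / kappa
    return total / n


def barrier_penalty(alive_fraction, min_alive=0.2, eps=1e-6):
    """Log-barrier on the ASSUMED constraint h = alive_fraction - min_alive > 0.

    The penalty grows without bound as h approaches 0 and is clipped at eps
    so that it stays finite once the constraint is violated.
    """
    h = max(alive_fraction - min_alive, eps)
    return -math.log(h)


def total_loss(pred_quantiles, target, alive_fraction, barrier_weight=0.1):
    """Distributional loss plus a weighted safety term (hypothetical weighting)."""
    return (quantile_huber_loss(pred_quantiles, target)
            + barrier_weight * barrier_penalty(alive_fraction))
```

With identical value predictions, a state in which most agents have terminated yields a larger total loss than one in which all agents are alive, which is the kind of pressure toward safer behaviour the abstract describes.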