FoX: Formation-aware exploration in multi-agent reinforcement learning
This paper addresses the scalability problem of exploration in MARL and is relevant to both researchers and practitioners, though the contribution is incremental in that it builds on existing MARL methods.
The paper tackles the exploration challenge in multi-agent reinforcement learning (MARL) by proposing a formation-aware exploration (FoX) framework. FoX shrinks the exponentially growing search space through a formation-based equivalence relation, so that exploration focuses on meaningful states in different formations. The authors report that it significantly outperforms state-of-the-art MARL algorithms on Google Research Football and sparse StarCraft II multi-agent challenge tasks.
Recently, deep multi-agent reinforcement learning (MARL) has gained significant popularity due to its success in various cooperative multi-agent tasks. However, exploration remains a challenging problem in MARL because the agents are only partially observable and the exploration space can grow exponentially as the number of agents increases. First, to address the scalability issue of the exploration space, we define a formation-based equivalence relation on the exploration space and aim to reduce the search space by exploring only meaningful states in different formations. Then, we propose a novel formation-aware exploration (FoX) framework that encourages partially observable agents to visit states in diverse formations by guiding them to be well aware of their current formation based solely on their own observations. Numerical results show that the proposed FoX framework significantly outperforms state-of-the-art MARL algorithms on Google Research Football (GRF) and sparse StarCraft II multi-agent challenge (SMAC) tasks.
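As a rough illustration of how an equivalence relation can shrink the exploration space, the sketch below groups joint states by a formation key and counts the resulting classes. The descriptor used here (pairwise distances between agents on a 1-D line) and the names `formation_key` and `count_states_and_formations` are hypothetical assumptions made for illustration. They are not the formation definition or the implementation used in the paper.

```python
from itertools import combinations, product


def formation_key(positions):
    """Assumed formation descriptor: pairwise distances between agents.

    This is an illustrative choice, not the paper's definition. It is
    invariant to shifting all agents by the same amount.
    """
    return tuple(abs(a - b) for a, b in combinations(positions, 2))


def count_states_and_formations(num_cells, num_agents):
    """Count joint states and distinct formation classes on a 1-D line."""
    # Every joint state assigns one cell to each agent.
    states = list(product(range(num_cells), repeat=num_agents))
    # States sharing a formation key fall into the same equivalence class.
    formations = {formation_key(s) for s in states}
    return len(states), len(formations)


if __name__ == "__main__":
    for n_agents in (2, 3):
        n_states, n_formations = count_states_and_formations(5, n_agents)
        print(n_agents, "agents:", n_states, "states,", n_formations, "formations")
```

With 2 agents on 5 cells, the 25 joint states collapse to 5 formation classes under this assumed descriptor, which suggests why exploring over formations rather than raw joint states could reduce the search space.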