Moving Forward in Formation: A Decentralized Hierarchical Learning Approach to Multi-Agent Moving Together
This work addresses the challenge of coordinating mobile robots in warehouses, offering a decentralized solution that outperforms existing end-to-end RL methods.
The paper tackles multi-agent path finding in formation with a decentralized hierarchical reinforcement learning approach that decomposes the task and balances the sub-task rewards. The approach outperforms end-to-end RL methods in simulation, scales to large world sizes, and is validated in a real-world deployment.
Multi-agent path finding in formation has many potential real-world applications, such as mobile warehouse robots. However, previous multi-agent path finding (MAPF) methods rarely take formation into consideration. Furthermore, they are usually centralized planners that require the whole state of the environment. The decentralized, partially observable alternatives for MAPF are reinforcement learning (RL) methods, but these methods encounter difficulties when learning path finding and formation keeping at the same time. In this paper, we propose a novel decentralized, partially observable RL algorithm that uses a hierarchical structure to decompose the multi-objective task into unrelated sub-tasks. It also calculates a theoretical weight that gives every task reward equal influence on the final RL value function. Additionally, we introduce a communication method that helps agents cooperate with each other. Experiments in simulation show that our method outperforms other end-to-end RL methods and naturally scales to large world sizes where centralized planners struggle. We also deploy and validate our method in a real-world scenario.
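The abstract does not state how the theoretical weight is derived, so the following is only a minimal sketch of the general idea of reward balancing, not the paper's formula. It assumes that each task's influence can be measured by the mean absolute discounted return of its reward stream, and it picks weights inversely proportional to that scale. The function names, the task names `path_finding` and `formation`, and the sample rewards are all hypothetical.

```python
from typing import Dict, List


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Discounted sum of one episode's rewards, accumulated backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def balancing_weights(
    episodes: Dict[str, List[List[float]]], gamma: float = 0.99
) -> Dict[str, float]:
    """Return one weight per task so that the weighted mean absolute
    discounted returns are equal across tasks (assumed balancing rule,
    not taken from the paper). Weights are normalized to sum to 1."""
    inverse_scale = {}
    for task, task_episodes in episodes.items():
        magnitudes = [abs(discounted_return(ep, gamma)) for ep in task_episodes]
        scale = sum(magnitudes) / len(magnitudes)
        if scale == 0.0:
            raise ValueError(f"task {task!r} has zero return; cannot balance")
        inverse_scale[task] = 1.0 / scale
    total = sum(inverse_scale.values())
    return {task: v / total for task, v in inverse_scale.items()}


# Hypothetical per-task reward samples: a sparse goal reward for path
# finding and a dense per-step penalty for breaking formation.
samples = {
    "path_finding": [[0.0, 0.0, 10.0]],
    "formation": [[-1.0, -1.0, -1.0]],
}
weights = balancing_weights(samples, gamma=1.0)
```

With these samples the undiscounted return magnitudes are 10 and 3, so the formation reward receives the larger weight and both weighted magnitudes come out equal. A closed-form weight, as the abstract describes, would replace the empirical estimate used here.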