Efficient Multi-robot Exploration via Multi-head Attention-based Cooperation Strategy
This addresses scalability and efficiency issues in multi-robot systems for applications like search and rescue, though it is an incremental improvement over existing learning-based methods.
The paper tackles the problem of decentralized multi-robot exploration in continuous space by proposing MAMECS, which uses multi-head attention to reduce state space and learn cooperation strategies, achieving a superior exploration rate.
The goal of coordinated multi-robot exploration tasks is to employ a team of autonomous robots to explore an unknown environment as quickly as possible. Compared with human-designed methods, which began with heuristic and rule-based approaches, learning-based methods enable individual robots to learn sophisticated and hard-to-design cooperation strategies through deep reinforcement learning technologies. However, in decentralized multi-robot exploration tasks, learning-based algorithms are still far from being universally applicable to the continuous space due to the difficulties associated with area calculation and reward function designing; moreover, existing learning-based methods encounter problems when attempting to balance the historical trajectory issue and target area conflict problem. Furthermore, the scalability of these methods to a large number of agents is poor because of the exponential explosion problem of state space. Accordingly, this paper proposes a novel approach - Multi-head Attention-based Multi-robot Exploration in Continuous Space (MAMECS) - aimed at reducing the state space and automatically learning the cooperation strategies required for decentralized multi-robot exploration tasks in continuous space. Computational geometry knowledge is applied to describe the environment in continuous space and to design an improved reward function to ensure a superior exploration rate. Moreover, the multi-head attention mechanism employed helps to solve the historical trajectory issue in the decentralized multi-robot exploration task, as well as to reduce the quadratic increase of action space.