Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation
This work addresses the need for efficient decision-making in dynamic swarm robotics scenarios. It presents a novel method for a known bottleneck rather than an incremental improvement.
The paper tackles the integration of discrete commands and continuous actions in swarm robotics for strategic confrontation. It proposes a bidirectional hierarchical reinforcement learning approach that achieves a win rate above 80% and a decision time under 0.01 seconds in experiments.
In swarm robotics, confrontation scenarios such as strategic confrontation require efficient decision-making that integrates discrete commands and continuous actions. Traditional task and motion planning methods separate decision-making into two layers, but their unidirectional structure fails to capture the interdependence between these layers, limiting adaptability in dynamic environments. Here, we propose a novel bidirectional approach based on hierarchical reinforcement learning, enabling dynamic interaction between the layers. This method maps commands to task allocation and actions to path planning, while leveraging cross-training techniques to enhance learning across the hierarchical framework. Furthermore, we introduce a trajectory prediction model that bridges abstract task representations with actionable planning goals. In our experiments, the method achieves a confrontation win rate above 80% and a decision time under 0.01 seconds, outperforming existing approaches. Large-scale tests and real-world robot experiments further demonstrate the generalization capabilities and practical applicability of our method.
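To make the two-way information flow concrete, the following is a minimal toy sketch, not the method of the paper. It assumes a greedy nearest-target rule in place of the learned task-allocation policy and a fixed-step controller in place of the learned path planner. All function names (`allocate`, `move`, `motion_feedback`, `run`) and the distance-based feedback signal are hypothetical, chosen only to show a discrete upper layer and a continuous lower layer exchanging information in both directions.

```python
import math


def motion_feedback(robots, targets):
    """Upward signal (assumed): distance each robot still has to travel."""
    return [[math.hypot(t[0] - r[0], t[1] - r[1]) for t in targets]
            for r in robots]


def allocate(feedback):
    """Upper layer (discrete): choose one target index per robot.

    A learned policy would go here; this greedy rule is a stand-in.
    """
    return [min(range(len(costs)), key=lambda j: costs[j])
            for costs in feedback]


def move(robot, target, max_step=1.0):
    """Lower layer (continuous): take one bounded step toward the target."""
    dx, dy = target[0] - robot[0], target[1] - robot[1]
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        return target  # close enough: snap to the target
    return (robot[0] + max_step * dx / dist,
            robot[1] + max_step * dy / dist)


def run(robots, targets, steps=10):
    """Alternate upward feedback and downward commands for a few steps."""
    assignment = []
    for _ in range(steps):
        feedback = motion_feedback(robots, targets)  # lower -> upper
        assignment = allocate(feedback)              # upper -> lower
        robots = [move(r, targets[a]) for r, a in zip(robots, assignment)]
    return robots, assignment
```

For example, `run([(0.0, 0.0), (10.0, 0.0)], [(3.0, 4.0), (10.0, 2.0)])` assigns each robot to its nearer target and moves both onto their targets within the ten steps. In a unidirectional pipeline, `allocate` would be called once and never see `motion_feedback` again; recomputing the allocation from lower-layer feedback at every step is the part this sketch is meant to illustrate.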