Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets
This work addresses the shepherding control problem in robotics and autonomous systems. Its contribution is incremental: it builds on existing reinforcement learning methods, and the novelty lies in how they are integrated.
The paper tackles the multi-agent shepherding problem for non-cohesive targets by proposing a decentralized reinforcement learning solution based on policy-gradient methods, which overcomes the discrete-action constraints of prior approaches and enables smoother trajectories. Experiments indicate that the method remains effective as the number of targets increases and when agents have limited sensing.
We propose a decentralized reinforcement learning solution for multi-agent shepherding of non-cohesive targets using policy-gradient methods. Our architecture integrates target-selection with target-driving through Proximal Policy Optimization, overcoming discrete-action constraints of previous Deep Q-Network approaches and enabling smoother agent trajectories. This model-free framework effectively solves the shepherding problem without prior dynamics knowledge. Experiments demonstrate our method's effectiveness and scalability with increased target numbers and limited sensing capabilities.
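To make the reference to Proximal Policy Optimization concrete, the following is a minimal sketch of the standard PPO clipped surrogate loss, written with NumPy. It is not the authors' implementation: the function name `ppo_clip_loss`, its argument names, and the default `clip_eps=0.2` are assumptions chosen for illustration. The abstract does not specify the network architecture, the reward, or how the target-selection and target-driving levels are trained.

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate objective, averaged over a batch.

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current policy and the data-collecting policy (hypothetical inputs; for
    continuous actions they would typically come from a Gaussian policy).
    advantages: advantage estimates for the same samples.
    """
    logp_new = np.asarray(logp_new, dtype=float)
    logp_old = np.asarray(logp_old, dtype=float)
    adv = np.asarray(advantages, dtype=float)

    # Probability ratio between the current and the old policy.
    ratio = np.exp(logp_new - logp_old)

    # Unclipped and clipped surrogate terms.
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv

    # PPO maximizes the elementwise minimum; return its negative as a loss.
    return -np.mean(np.minimum(unclipped, clipped))
```

Because the loss depends only on action log-probabilities, it applies equally to discrete and continuous action spaces, which is what lets a policy-gradient method avoid the discretized actions that a Deep Q-Network requires.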