AI LG MA RO

Dec 16, 2022

An Energy-aware and Fault-tolerant Deep Reinforcement Learning based approach for Multi-agent Patrolling Problems

Chenhao Tong, Aaron Harwood, Maria A. Rodriguez, Richard O. Sinnott

arXiv:2212.08230v42.52 citations

h-index: 43

Originality: Synthesis-oriented

AI Analysis

This work addresses patrolling problems for autonomous vehicles, but it is incremental: it applies existing reinforcement learning methods to a specific domain and adds features such as energy management and fault tolerance.

The authors address multi-agent patrolling in complex environments with unknown factors and agent failures. They propose a deep multi-agent reinforcement learning approach in which agents recharge autonomously and coordinate with one another. The approach is validated in simulation for patrol performance, recharging efficiency, fault tolerance, and cooperation with supplementary agents.

Autonomous vehicles are suited for continuous area patrolling problems. However, finding an optimal patrolling strategy can be challenging for many reasons. Firstly, patrolling environments are often complex and can include unknown environmental factors, such as wind or landscape. Secondly, autonomous vehicles can have failures or hardware constraints, such as limited battery life. Importantly, patrolling large areas often requires multiple agents that need to collectively coordinate their actions. In this work, we consider these limitations and propose an approach based on model-free, deep multi-agent reinforcement learning. In this approach, the agents are trained to patrol an environment with various unknown dynamics and factors. They can automatically recharge themselves to support continuous collective patrolling. A distributed homogeneous multi-agent architecture is proposed, where all patrolling agents execute identical policies locally based on their local observations and shared location information. This architecture provides a patrolling system that can tolerate agent failures and allow supplementary agents to be added to replace failed agents or to increase the overall patrol performance. The solution is validated through simulation experiments from multiple perspectives, including the overall patrol performance, the efficiency of battery recharging strategies, the overall fault tolerance, and the ability to cooperate with supplementary agents.
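The architecture described in the abstract (identical policies executed locally, shared location information, autonomous recharging, tolerance of failed and supplementary agents) can be illustrated with a toy simulation. The sketch below is not the authors' implementation: the trained deep network is replaced by a hand-written heuristic called `shared_policy`, and the 5x5 grid, the single charger cell, the battery capacity of 20 moves, and the globally visible idleness map are all assumptions made for illustration only.

```python
from dataclasses import dataclass

GRID = 5            # hypothetical 5x5 patrol area
CHARGER = (0, 0)    # hypothetical charging-station cell
CAPACITY = 20       # hypothetical battery capacity, in moves


@dataclass
class Agent:
    pos: tuple = CHARGER
    battery: int = CAPACITY
    alive: bool = True


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step_towards(pos, target):
    """Move one cell along x first, then y; stay put if already there."""
    x, y = pos
    if x != target[0]:
        x += 1 if target[0] > x else -1
    elif y != target[1]:
        y += 1 if target[1] > y else -1
    return (x, y)


def shared_policy(pos, battery, idleness, peer_positions):
    """Hand-written stand-in for a trained network; identical for every agent."""
    if battery <= manhattan(pos, CHARGER) + 1:
        target = CHARGER  # just enough charge left to get home
    else:
        # Stalest cell not occupied by a peer; nearer cells win ties.
        free = [c for c in idleness if c not in peer_positions]
        target = max(free, key=lambda c: (idleness[c], -manhattan(pos, c)))
    return step_towards(pos, target)


def simulate(n_agents=3, steps=200, fail_at=None, join_at=None):
    agents = [Agent() for _ in range(n_agents)]
    idleness = {(x, y): 0 for x in range(GRID) for y in range(GRID)}
    visited, min_battery = set(), CAPACITY
    for t in range(steps):
        if t == fail_at:
            agents[0].alive = False      # injected failure
        if t == join_at:
            agents.append(Agent())       # supplementary agent, same policy
        for cell in idleness:
            idleness[cell] += 1
        for agent in agents:
            if not agent.alive:
                continue
            # Each agent decides locally, using only shared peer locations.
            peers = {a.pos for a in agents if a is not agent and a.alive}
            agent.pos = shared_policy(agent.pos, agent.battery, idleness, peers)
            agent.battery = CAPACITY if agent.pos == CHARGER else agent.battery - 1
            idleness[agent.pos] = 0
            visited.add(agent.pos)
            min_battery = min(min_battery, agent.battery)
    return {"coverage": len(visited), "max_idleness": max(idleness.values()),
            "min_battery": min_battery, "alive": sum(a.alive for a in agents)}


if __name__ == "__main__":
    print(simulate())                          # three healthy agents
    print(simulate(fail_at=50, join_at=100))   # one failure, one replacement
```

Because every agent calls the same `shared_policy` and no central controller assigns routes, removing an agent or appending a new one needs no reconfiguration in this sketch. That is the property the abstract attributes to the homogeneous architecture, shown here only in a much simplified form.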
