RO CV | Sep 25, 2024

Scalable Multi-Robot Informative Path Planning for Target Mapping via Deep Reinforcement Learning

Apoorva Vashisth, Manav Kulshrestha, Damon Conover, Aniket Bera

arXiv:2409.16967v34.11 citations | h-index: 5 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses scalable target mapping for autonomous multi-robot systems, offering incremental improvements in efficiency and coordination.

The paper tackles the Multi-Robot Informative Path Planning problem by proposing a deep reinforcement learning approach to maximize discovered stationary targets in unknown 3D environments under resource constraints, achieving at least a 26.2% improvement in discovered targets over state-of-the-art methods with planning times under 2 seconds per step.

Autonomous robots are widely utilized for mapping and exploration tasks due to their cost-effectiveness. Multi-robot systems offer scalability and efficiency, especially in terms of the number of robots deployed in more complex environments. These tasks belong to the set of Multi-Robot Informative Path Planning (MRIPP) problems. In this paper, we propose a deep reinforcement learning approach for the MRIPP problem. We aim to maximize the number of discovered stationary targets in an unknown 3D environment while operating under resource constraints (such as path length). Here, each robot aims to maximize discovered targets, avoid unknown static obstacles, and prevent inter-robot collisions while operating under communication and resource constraints. We utilize the centralized training and decentralized execution paradigm to train a single policy neural network. A key aspect of our approach is our coordination graph that prioritizes visiting regions not yet explored by other robots. Our learned policy can be copied onto any number of robots for deployment in more complex environments not seen during training. Our approach outperforms state-of-the-art approaches by at least 26.2% in terms of the number of discovered targets while requiring a planning time of less than 2 seconds per step. We present results for more complex environments with up to 64 robots and compare success rates against baseline planners. Our code and trained model are available at https://github.com/AccGen99/marl_ipp
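To make the problem setting in the abstract concrete, here is a minimal toy sketch, not the authors' method. It replaces the learned policy and the coordination graph with a hand-written greedy rule on a 2D grid, and it assumes perfect communication (one shared visited set), a unit cost per step, and no obstacles. All names (`next_cell`, `run_team`) are hypothetical and do not come from the paper or its repository. It only illustrates three ideas from the abstract: the same decision rule is copied onto every robot, each robot has its own path-length budget, and robots prefer cells that no robot has visited yet while avoiding cells occupied by teammates.

```python
from typing import List, Optional, Set, Tuple

Cell = Tuple[int, int]
MOVES = ((0, 1), (1, 0), (0, -1), (-1, 0))  # 4-connected grid moves


def next_cell(pos: Cell, visited: Set[Cell], occupied: Set[Cell], size: int) -> Optional[Cell]:
    """Hypothetical stand-in for the shared policy: a greedy rule, not a learned one.

    Picks an in-bounds neighbour that is not occupied by another robot,
    preferring cells that no robot has visited yet.
    """
    r, c = pos
    candidates = [
        (r + dr, c + dc)
        for dr, dc in MOVES
        if 0 <= r + dr < size and 0 <= c + dc < size and (r + dr, c + dc) not in occupied
    ]
    if not candidates:
        return None  # boxed in: the robot waits this step
    # False sorts before True, so unvisited cells win; ties keep MOVES order.
    return min(candidates, key=lambda cell: cell in visited)


def run_team(size: int, starts: List[Cell], targets: Set[Cell], budget: int) -> Tuple[Set[Cell], Set[Cell]]:
    """Run every robot with the same rule until each path-length budget is spent."""
    positions = list(starts)
    remaining = [budget] * len(starts)  # per-robot path-length budget (assumed unit step cost)
    visited = set(starts)               # shared map: assumes perfect communication
    found = targets & visited
    while any(b > 0 for b in remaining):
        for i, pos in enumerate(positions):
            if remaining[i] == 0:
                continue
            occupied = {p for j, p in enumerate(positions) if j != i}
            remaining[i] -= 1           # waiting also consumes budget in this toy model
            nxt = next_cell(pos, visited, occupied, size)
            if nxt is None:
                continue
            positions[i] = nxt
            visited.add(nxt)
            if nxt in targets:
                found.add(nxt)
    return found, visited
```

Calling `run_team(5, [(0, 0), (4, 4)], {(0, 2), (3, 4)}, budget=6)` returns the set of discovered targets and the set of visited cells. Because the rule is stateless and identical for every robot, the number of robots can be changed without modifying the rule, which loosely mirrors the abstract's point that one trained policy is copied onto any number of robots.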
