RO AI LG MA | Sep 28, 2025

Sequence Pathfinder for Multi-Agent Pickup and Delivery in the Warehouse

Zeyuan Zhao, Chaoran Li, Shao Zhang, Ying Wen

arXiv:2509.23778v2 | 3.2 | h-index: 12

Originality: Highly original

AI Analysis

This work addresses the problem of efficient multi-agent task completion in warehouse logistics, offering a novel method that improves performance and scalability, though it is incremental in building on sequence modeling and Transformer paradigms.

The paper tackles the challenge of Multi-Agent Pickup and Delivery (MAPD) in warehouse environments with narrow pathways, where existing learning-based methods perform poorly due to reliance on local observations or high computational complexity from communication. It proposes the Sequential Pathfinder (SePar), which uses a Transformer-based approach to reduce decision-making complexity from exponential to linear, outperforming existing methods and generalizing to unseen environments.

Multi-Agent Pickup and Delivery (MAPD) is a challenging extension of Multi-Agent Path Finding (MAPF), where agents are required to sequentially complete tasks with fixed-location pickup and delivery demands. Although learning-based methods have made progress in MAPD, they often perform poorly in warehouse-like environments with narrow pathways and long corridors when relying only on local observations for distributed decision-making. Communication learning can alleviate the lack of global information but introduces high computational complexity due to point-to-point communication. To address this challenge, we formulate MAPF as a sequence modeling problem and prove that path-finding policies under sequence modeling possess order-invariant optimality, ensuring their effectiveness in MAPD. Building on this, we propose the Sequential Pathfinder (SePar), which leverages the Transformer paradigm to achieve implicit information exchange, reducing decision-making complexity from exponential to linear while maintaining efficiency and global awareness. Experiments demonstrate that SePar consistently outperforms existing learning-based methods across various MAPF tasks and their variants, and generalizes well to unseen environments. Furthermore, we highlight the necessity of integrating imitation learning in complex maps like warehouses.
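
The "exponential to linear" claim can be illustrated with a toy sketch. A joint policy over n agents with |A| actions each has |A|^n joint actions to consider, whereas deciding one agent at a time, each conditioned on the choices already made, needs only n·|A| evaluations. The code below is not SePar: the greedy Manhattan-distance score stands in for a learned Transformer decoder, and the five-move action set, the function names, and the collision rule (no two agents may pick the same next cell) are all assumptions made for illustration.

```python
# Assumed action set: stay, up, down, left, right on a 4-connected grid.
ACTIONS = [(0, 0), (0, 1), (0, -1), (-1, 0), (1, 0)]


def manhattan(a, b):
    """Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def joint_action_count(num_agents):
    """Size of the joint action space: |A| ** n (exponential in n)."""
    return len(ACTIONS) ** num_agents


def sequential_decide(positions, goals):
    """Choose next cells one agent at a time, conditioned on earlier choices.

    Hypothetical stand-in for a learned sequence model: each agent greedily
    minimises distance to its goal among cells not already claimed by agents
    earlier in the order. Returns (next_positions, number_of_evaluations).
    """
    claimed = set()        # next cells chosen by earlier agents
    next_positions = []
    evaluations = 0
    for pos, goal in zip(positions, goals):
        best = None
        for dx, dy in ACTIONS:
            evaluations += 1
            cand = (pos[0] + dx, pos[1] + dy)
            if cand in claimed:
                continue   # would share a cell with an earlier agent
            score = manhattan(cand, goal)
            if best is None or score < best[0]:
                best = (score, cand)
        # Toy fallback: if every candidate is claimed, stay in place.
        chosen = best[1] if best is not None else pos
        claimed.add(chosen)
        next_positions.append(chosen)
    return next_positions, evaluations
```

For two agents at (0, 0) and (2, 0) that both want cell (1, 0), the first agent takes the cell and the second waits, using 10 evaluations where the joint action space has 25 entries. The sketch ignores edge (swap) conflicts and walls, which a real MAPF solver must handle.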
