Hierarchically Structured Scheduling and Execution of Tasks in a Multi-Agent Environment
The work addresses inefficiencies in warehouse logistics for industries that rely on dynamic task management, although its application of reinforcement learning to hierarchical structures is incremental.
The paper tackles dynamic task scheduling and execution in a multi-agent warehouse environment. It proposes deep reinforcement learning for both centralized scheduling and decentralized execution, and reports improved efficiency over traditional methods.
In a warehouse environment, tasks appear dynamically. Consequently, a task management system that matches them with the workforce too early (e.g., weeks in advance) is necessarily sub-optimal. Moreover, the rapidly growing action space of such a system poses a significant problem for traditional schedulers. Reinforcement learning, however, is well suited to problems that require sequential decisions towards a long-term, often remote, goal. In this work, we address a problem with a hierarchical structure: task scheduling by a centralised agent in a dynamic multi-agent warehouse environment, and the execution of such a schedule by decentralised agents with only partial observability thereof. We propose to use deep reinforcement learning to solve both the high-level scheduling problem and the low-level multi-agent problem of schedule execution. Finally, we also consider the case where centralisation is impossible at test time and workers must learn to cooperate in executing the tasks in an environment with no schedule and only partial observability.
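To make the two-level structure concrete, the following is a minimal sketch of a scheduling and execution loop. It is not the method proposed in the paper. It assumes a toy grid warehouse and replaces both learned policies with hand-written placeholders: `schedule` is a greedy heuristic standing in for the centralised scheduling agent, and `Worker.step` is a fixed movement rule standing in for a decentralised execution policy. All names (`Task`, `Worker`, `schedule`, `run`) and the choice to let each worker observe only its own next target are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Position = Tuple[int, int]


@dataclass
class Task:
    task_id: int
    location: Position


@dataclass
class Worker:
    worker_id: int
    position: Position
    queue: List[Task] = field(default_factory=list)

    def observe(self) -> Optional[Position]:
        # Assumed form of partial observability: a worker sees only the
        # location of its own next task, not the full schedule.
        return self.queue[0].location if self.queue else None

    def step(self) -> Optional[int]:
        # Placeholder low-level policy: move one cell toward the target
        # (x first, then y). Returns the task id if the task is completed.
        target = self.observe()
        if target is None:
            return None
        x, y = self.position
        tx, ty = target
        if x != tx:
            x += 1 if tx > x else -1
        elif y != ty:
            y += 1 if ty > y else -1
        self.position = (x, y)
        if self.position == target:
            return self.queue.pop(0).task_id
        return None


def manhattan(a: Position, b: Position) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def schedule(task: Task, workers: List[Worker]) -> Worker:
    # Placeholder high-level policy: assign the task to the worker whose
    # queued route, extended by this task, is shortest. A learned
    # scheduler would replace this heuristic.
    def route_length(worker: Worker) -> int:
        pos, total = worker.position, 0
        for queued in worker.queue:
            total += manhattan(pos, queued.location)
            pos = queued.location
        return total + manhattan(pos, task.location)

    chosen = min(workers, key=route_length)
    chosen.queue.append(task)
    return chosen


def run(arrivals: Dict[int, List[Task]], workers: List[Worker],
        max_steps: int) -> Dict[int, int]:
    # Tasks arrive dynamically: each is scheduled at its arrival step,
    # not in advance. Returns the completion step of each finished task.
    completed: Dict[int, int] = {}
    for t in range(max_steps):
        for task in arrivals.get(t, []):
            schedule(task, workers)
        for worker in workers:
            finished = worker.step()
            if finished is not None:
                completed[finished] = t
    return completed


if __name__ == "__main__":
    demo_workers = [Worker(0, (0, 0)), Worker(1, (5, 5))]
    demo_arrivals = {0: [Task(0, (2, 0))], 1: [Task(1, (5, 3))]}
    print(run(demo_arrivals, demo_workers, max_steps=10))
```

In this sketch the scheduler is invoked only when a task arrives, which reflects the point that matching tasks to workers far in advance is sub-optimal. The schedule-free setting mentioned at the end of the abstract would correspond to removing `schedule` and letting workers choose tasks from their own local observations, which the sketch does not attempt.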