AI LG OC Apr 28, 2025

Automated decision-making for dynamic task assignment at scale

Riccardo Lo Bianco, Willem van Jaarsveld, Jeroen Middelhuis, Luca Begnardi, Remco Dijkman

arXiv:2504.19933v1 5.81 citations h-index: 17 Has Code

Originality: Incremental advance

AI Analysis

This work addresses scalable, efficient task assignment in dynamic environments such as logistics and service industries. Its contribution is incremental, since it builds on existing DRL methods.

The paper tackles the Dynamic Task Assignment Problem (DTAP) at real-world scale by proposing a Deep Reinforcement Learning-based Decision Support System, which matches or outperforms baselines in minimizing average task cycle time across five real-world instances.

The Dynamic Task Assignment Problem (DTAP) concerns matching resources to tasks in real time while minimizing an objective such as resource cost or task cycle time. In this work, we consider a DTAP variant where every task is a case composed of a stochastic sequence of activities. The DTAP, in this case, involves deciding which employee to assign to which activity so that requests are processed as quickly as possible. In recent years, Deep Reinforcement Learning (DRL) has emerged as a promising tool for tackling this DTAP variant, but most research is limited to solving small-scale, synthetic problems, neglecting the challenges posed by real-world use cases. To bridge this gap, this work proposes a DRL-based Decision Support System (DSS) for real-world scale DTAPs. To this end, we introduce a DRL agent with two novel elements: a graph structure for observations and actions that can effectively represent any DTAP, and a reward function that is provably equivalent to the objective of minimizing the average cycle time of tasks. The combination of these two novelties allows the agent to learn effective and generalizable assignment policies for real-world scale DTAPs. The proposed DSS is evaluated on five DTAP instances whose parameters are extracted from real-world logs through process mining. The experimental evaluation shows that the proposed DRL agent matches or outperforms the best baseline in all DTAP instances and generalizes across different time horizons and across instances.
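As a rough illustration of how a step-wise reward can line up with the average cycle time objective, the sketch below uses a standard queueing identity: when all cases complete, the sum of their cycle times equals the time integral of the number of cases in the system. This is an assumption chosen for illustration and not necessarily the reward defined in the paper; the function names and the toy arrival and completion times are hypothetical.

```python
from typing import List, Tuple

Case = Tuple[float, float]  # (arrival time, completion time), hypothetical toy format


def average_cycle_time(cases: List[Case]) -> float:
    """Mean of (completion - arrival) over all cases."""
    return sum(done - arrived for arrived, done in cases) / len(cases)


def negative_area_reward(cases: List[Case]) -> float:
    """Accumulate -(cases in system) * (interval length) between consecutive events."""
    # +1 marks an arrival, -1 marks a completion.
    events = sorted([(a, 1) for a, _ in cases] + [(d, -1) for _, d in cases])
    total, in_system, prev_t = 0.0, 0, events[0][0]
    for t, delta in events:
        total -= in_system * (t - prev_t)  # penalty for the interval that just ended
        in_system += delta
        prev_t = t
    return total


# Toy data (illustrative only): three cases with overlapping lifetimes.
cases = [(0.0, 4.0), (1.0, 3.0), (2.0, 7.0)]
avg_direct = average_cycle_time(cases)
avg_from_reward = -negative_area_reward(cases) / len(cases)
print(avg_direct, avg_from_reward)
```

Under this assumption, maximizing the accumulated reward over a fixed set of completed cases is the same as minimizing their average cycle time, because the two quantities differ only by a sign and the constant factor `len(cases)`.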
