QUANT-PH | AI | LG | OC | Aug 7, 2025

Quantum-Efficient Reinforcement Learning Solutions for Last-Mile On-Demand Delivery

arXiv:2508.09183v2 | 1 citation | h-index: 1 | 2025 IEEE International Conference on Quantum Artificial Intelligence (QAI)
Originality: Incremental advance
AI Analysis

This work addresses optimization challenges in logistics and delivery services. Its quantum-enhanced solution is an incremental advance: it combines RL with quantum computing for a specific domain problem.

The paper tackles the large-scale Capacitated Pickup and Delivery Problem with Time Windows (CPDPTW) for last-mile on-demand delivery. It proposes a Reinforcement Learning framework augmented with a Parametrized Quantum Circuit to minimize travel time, and reports improved scalability and training complexity relative to two baselines, PPO and QSVT.
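To make the three CPDPTW constraint families (capacity, pickup before delivery, time windows) concrete, here is a minimal Python sketch of a single-vehicle route check. The `Stop` fields, the function name `evaluate_route`, unit-speed Euclidean travel, zero service time, and the return leg to the depot are all illustrative assumptions, not details taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    request: int     # hypothetical id shared by a pickup and its delivery
    is_pickup: bool
    demand: int      # load taken on at pickup, released at delivery
    earliest: float  # time window start
    latest: float    # time window end
    x: float
    y: float


def evaluate_route(depot, stops, capacity):
    """Return total travel time of the route, or None if it is infeasible.

    Assumptions: unit speed (time == Euclidean distance), no service time,
    the vehicle starts empty at the depot at t=0 and returns to it.
    """
    load, clock, travelled = 0, 0.0, 0.0
    on_board = set()
    prev = depot
    for s in stops:
        leg = math.hypot(s.x - prev[0], s.y - prev[1])
        travelled += leg
        clock = max(clock + leg, s.earliest)  # wait if the vehicle is early
        if clock > s.latest:
            return None                       # time window violated
        if s.is_pickup:
            load += s.demand
            if load > capacity:
                return None                   # capacity violated
            on_board.add(s.request)
        else:
            if s.request not in on_board:
                return None                   # delivery before its pickup
            on_board.remove(s.request)
            load -= s.demand
        prev = (s.x, s.y)
    if on_board:
        return None                           # a pickup was never delivered
    return travelled + math.hypot(prev[0] - depot[0], prev[1] - depot[1])


pickup = Stop(request=0, is_pickup=True, demand=1, earliest=0, latest=10, x=3, y=4)
dropoff = Stop(request=0, is_pickup=False, demand=1, earliest=0, latest=20, x=3, y=0)
print(evaluate_route((0.0, 0.0), [pickup, dropoff], capacity=1))
```

An RL agent for this problem would typically use a check of this kind to mask infeasible actions or to compute the travel-time reward; how the paper does so is not stated in this summary.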

Quantum computation has emerged as a promising alternative for solving NP-hard combinatorial problems. In optimization especially, classical approaches become intractable at large scale. We investigate quantum computing for the large-scale Capacitated Pickup and Delivery Problem with Time Windows (CPDPTW). To this end, a Reinforcement Learning (RL) framework augmented with a Parametrized Quantum Circuit (PQC) is designed to minimize travel time in a realistic last-mile on-demand delivery setting. A novel problem-specific encoding quantum circuit with an entangling layer and a variational layer is proposed. Moreover, Proximal Policy Optimization (PPO) and Quantum Singular Value Transformation (QSVT) are designed as baselines for comparison in numerical experiments, which highlight the superiority of the proposed method in terms of solution scale and training complexity while incorporating real-world constraints.
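The abstract describes the circuit only as an encoding stage followed by an entangling layer and a variational layer. As a rough illustration of that general structure, the NumPy sketch below simulates a small PQC and reads out a policy over actions. The RY angle encoding, the chain of CZ gates, the one-qubit-per-action readout, and the softmax are all assumptions chosen for simplicity; the paper's problem-specific circuit is likely different.

```python
import numpy as np


def ry(theta):
    """Single-qubit RY rotation matrix (real-valued)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def apply_1q(state, gate, q, n):
    """Apply a 2x2 gate to qubit q of an n-qubit statevector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [q]))  # new qubit axis lands at 0
    return np.moveaxis(psi, 0, q).reshape(-1)


def apply_cz(state, q1, q2, n):
    """Apply a controlled-Z between qubits q1 and q2."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[q1], idx[q2] = 1, 1
    psi[tuple(idx)] *= -1  # phase flip where both qubits are |1>
    return psi.reshape(-1)


def pqc_policy(features, weights):
    """Return action probabilities from a three-stage circuit (a sketch)."""
    n = len(features)
    state = np.zeros(2 ** n)
    state[0] = 1.0
    for q in range(n):                 # encoding layer: one feature per qubit
        state = apply_1q(state, ry(features[q]), q, n)
    for q in range(n - 1):             # entangling layer: chain of CZ gates
        state = apply_cz(state, q, q + 1, n)
    for q in range(n):                 # variational layer: trainable angles
        state = apply_1q(state, ry(weights[q]), q, n)
    probs = (np.abs(state) ** 2).reshape([2] * n)
    # Score for action q = probability of measuring qubit q in |1>.
    scores = np.array([
        probs.sum(axis=tuple(a for a in range(n) if a != q))[1]
        for q in range(n)
    ])
    exp = np.exp(scores)
    return exp / exp.sum()             # softmax turns scores into a policy


policy = pqc_policy(features=np.array([np.pi, 0.0, 0.0]), weights=np.zeros(3))
print(policy)
```

In an RL loop, `weights` would be the trainable parameters updated from the travel-time reward, while `features` would encode the current delivery state; both roles are stated here as assumptions about how such a circuit is typically used.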

