Deep Q-Learning for Same-Day Delivery with Vehicles and Drones
This work addresses delivery logistics for e-commerce and service providers, but its contribution is incremental: it applies existing deep reinforcement learning methods to a specific operational problem.
The paper tackles dynamic same-day delivery with a mixed fleet of vehicles and drones. It proposes a deep Q-learning approach that decides whether to assign each new customer to a vehicle, to a drone, or to no service, and it reports that the learned policy outperforms benchmark policies in computational experiments.
In this paper, we consider same-day delivery with vehicles and drones. Customers make delivery requests over the course of the day, and the dispatcher dynamically dispatches vehicles and drones to deliver the goods to customers before their delivery deadline. Vehicles can deliver multiple packages in one route but travel relatively slowly due to urban traffic. Drones travel faster, but they have limited capacity and require charging or battery swaps. To exploit the different strengths of the fleets, we propose a deep Q-learning approach. Our method learns the value of assigning a new customer to either drones or vehicles as well as the option to not offer service at all. In a systematic computational analysis, we show the superiority of our policy compared to benchmark policies and the effectiveness of our deep Q-learning approach. We also show that our policy can maintain effectiveness when the fleet size changes moderately. Experiments on data drawn from varied spatial/temporal distributions demonstrate that our trained policies can cope with changes in the input data.
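To make the three-way decision described in the abstract concrete, the following is a minimal sketch of how a Q-learning agent might choose among "vehicle", "drone", and "no service" for a new request, and how a one-step learning target could be formed. It is not the authors' implementation: the epsilon-greedy rule, the feasibility mask, the discount factor `gamma`, and the function names `select_action` and `td_target` are all assumptions introduced for illustration, and the Q-values are passed in as plain numbers rather than produced by a neural network.

```python
import random
from typing import Sequence

# The three options named in the abstract; the ordering is an arbitrary choice.
ACTIONS = ("vehicle", "drone", "no_service")


def select_action(q_values: Sequence[float],
                  feasible: Sequence[bool],
                  epsilon: float,
                  rng: random.Random) -> int:
    """Epsilon-greedy choice among feasible actions (hypothetical helper).

    Returns an index into ACTIONS. The feasibility mask is an assumption:
    it stands in for checks such as deadlines or drone availability.
    """
    candidates = [i for i, ok in enumerate(feasible) if ok]
    if not candidates:
        raise ValueError("at least one action must be feasible")
    if rng.random() < epsilon:
        return rng.choice(candidates)  # explore: random feasible action
    return max(candidates, key=lambda i: q_values[i])  # exploit: best Q-value


def td_target(reward: float,
              next_q_values: Sequence[float],
              next_feasible: Sequence[bool],
              gamma: float,
              terminal: bool) -> float:
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a').

    Assumes at least one next action is feasible when the state is not
    terminal (for example, declining service is always possible).
    """
    if terminal:
        return reward
    best_next = max(q for q, ok in zip(next_q_values, next_feasible) if ok)
    return reward + gamma * best_next


if __name__ == "__main__":
    rng = random.Random(0)
    # Drone is infeasible here, so the greedy choice falls back to the vehicle.
    choice = select_action([0.8, 1.2, 0.0], [True, False, True], 0.0, rng)
    print(ACTIONS[choice])
    print(td_target(1.0, [0.5, 2.0, 0.0], [True, True, True], 0.5, False))
```

In a deep Q-learning setting, the `q_values` argument would come from a neural network evaluated on the current state, and `td_target` would supply the regression target for the chosen action. How the state, the reward, and feasibility are actually defined is not specified in the abstract, so those parts are left as placeholders here.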