A Reinforcement Learning Approach for Electric Vehicle Routing Problem with Vehicle-to-Grid Supply
This work addresses routing efficiency for fleet operators using electric vehicles, enabling faster decision-making for large-scale problems, though it is incremental as it applies RL to an existing optimization challenge.
The paper tackled the electric vehicle routing problem with vehicle-to-grid supply constraints by developing a reinforcement learning approach called QuikRouteFinder, which achieved results 24 times faster than exact and metaheuristic methods while maintaining solution quality within 20% of optimal.
The use of electric vehicles (EVs) in the last mile is appealing from both sustainability and operational cost perspectives. In addition to the inherent cost efficiency of EVs, selling energy back to the grid during peak grid demand is a potential source of additional revenue for a fleet operator. To achieve this, EVs have to be at specific locations (discharge points) at specific times (the peak period), while still meeting their core purpose of delivering goods to customers. In this work, we consider the problem of EV routing with constraints on loading capacity, time windows, and vehicle-to-grid energy supply (CEVRPTW-D). Solutions must not only satisfy multiple system objectives, but also scale efficiently to large problem sizes involving hundreds of customers and discharge stations. We present QuikRouteFinder, which uses reinforcement learning (RL) for EV routing to overcome these challenges. Using Solomon datasets, results from RL are compared against exact formulations based on a mixed-integer linear program (MILP) and against genetic algorithm (GA) metaheuristics. On average, the results show that RL is 24 times faster than MILP and GA, while being close in quality (within 20%) to the optimal.
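To make the three constraint families named above concrete, the following is a minimal sketch of a feasibility check for a single route. It is not the paper's formulation or the QuikRouteFinder implementation. The names `Stop` and `route_is_feasible`, the Euclidean travel distances, the linear energy consumption per unit distance, and the modeling of the peak period as the time window of a discharge stop are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Stop:
    """Hypothetical node type: depot, customer, or discharge point."""
    x: float
    y: float
    demand: float = 0.0          # goods to deliver (0 for depot and discharge points)
    ready: float = 0.0           # earliest time service may start
    due: float = float("inf")    # latest time the vehicle may arrive
    service: float = 0.0         # time spent at the stop
    discharge: float = 0.0       # energy sold to the grid here (0 for customers)


def route_is_feasible(depot, stops, capacity, battery, energy_per_dist, speed=1.0):
    """Check capacity, time-window, and energy constraints for one route.

    Assumes the vehicle leaves the depot at time 0 with a full battery,
    visits `stops` in order, and returns to the depot.
    """
    # Capacity: total delivered load must fit in the vehicle.
    if sum(s.demand for s in stops) > capacity:
        return False

    t, charge, prev = 0.0, battery, depot
    for s in list(stops) + [depot]:
        dist = hypot(s.x - prev.x, s.y - prev.y)
        charge -= dist * energy_per_dist    # assumed linear consumption
        t += dist / speed
        if charge < 0 or t > s.due:         # battery empty or arrived too late
            return False
        t = max(t, s.ready) + s.service     # wait for the window to open if early
        charge -= s.discharge               # energy supplied to the grid
        if charge < 0:
            return False
        prev = s
    return True
```

A route that visits a customer and then a discharge point whose window `[ready, due]` stands in for the peak period passes only if the load fits, every arrival is on time, and the battery covers both travel and the energy sold. A learned policy or a search method would call a check of this kind many times, which is one reason fast evaluation matters at the problem sizes mentioned above.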