AI OC · Sep 29, 2015

Two Phase $Q$-learning for Bidding-based Vehicle Sharing

arXiv:1509.08932v3 · 2.9

Originality: Incremental advance

AI Analysis

The paper addresses operational inefficiencies in vehicle sharing systems, for both operators and customers, by replacing fixed pricing with a dynamic bidding approach. The advance is incremental because it builds on existing auction and reinforcement learning methods.

The paper tackles the problem of optimizing car distribution and service quality in one-way vehicle sharing systems by introducing a bidding-based pricing mechanism, where customers submit bids and the operator decides on rentals, including accepting negative bids to incentivize rebalancing. It models this as a constrained Markov decision problem and proposes a two-phase Q-learning algorithm, with numerical experiments demonstrating its effectiveness.

We consider one-way vehicle sharing systems where customers can rent a car at one station and drop it off at another. The problem we address is to optimize the distribution of cars and the quality of service by pricing rentals appropriately. We propose a bidding approach that is inspired by auctions and takes into account the significant uncertainty inherent in the problem data (e.g., pick-up and drop-off locations, time of requests, and duration of trips). Specifically, in contrast to current vehicle sharing systems, the operator does not set prices. Instead, customers submit bids and the operator decides whether to rent or not. The operator can even accept negative bids to motivate drivers to rebalance available cars to unpopular destinations within a city. We model the operator's sequential decision-making problem as a \emph{constrained Markov decision problem} (CMDP) and propose and rigorously analyze a novel two-phase $Q$-learning algorithm for its solution. Numerical experiments are presented and discussed.
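
To make the accept-or-reject decision concrete, here is a minimal sketch of tabular $Q$-learning on a toy two-station version of the problem. It is not the paper's two-phase algorithm, which the abstract does not spell out. It assumes a single fixed Lagrange multiplier `lam` that folds a service-quality constraint into the reward, and every name and number in it (`N_CARS`, `BIDS`, `step`, `q_update`, `train`, the cost of 1 for an empty station) is hypothetical and chosen only for illustration.

```python
import random

N_CARS = 4                 # hypothetical fleet size shared by two stations
BIDS = [-2, 0, 3]          # hypothetical bid levels; a negative bid means the operator pays the driver
ACTIONS = [0, 1]           # 0 = reject the bid, 1 = accept it


def step(cars_a, origin, bid, action):
    """Toy transition: returns (reward, constraint_cost, next_cars_a)."""
    available = cars_a if origin == 0 else N_CARS - cars_a
    if action == 1 and available > 0:
        cars_a += -1 if origin == 0 else 1   # the car moves to the other station
        reward = bid
    else:
        reward = 0
    # Assumed constraint cost: 1 whenever either station ends the step with no cars.
    cost = 1 if cars_a in (0, N_CARS) else 0
    return reward, cost, cars_a


def q_update(Q, s, a, r, c, s_next, lam, alpha=0.1, gamma=0.95):
    """One tabular Q-learning update on the Lagrangian reward r - lam * c."""
    best_next = max(Q[(s_next, b)] for b in ACTIONS)
    target = (r - lam * c) + gamma * best_next
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def train(lam=1.0, steps=20000, eps=0.1, seed=0):
    """Epsilon-greedy training loop; state = (cars at station A, request origin, bid index)."""
    rng = random.Random(seed)
    states = [(n, o, b)
              for n in range(N_CARS + 1)
              for o in (0, 1)
              for b in range(len(BIDS))]
    Q = {(s, a): 0.0 for s in states for a in ACTIONS}
    s = (N_CARS // 2, rng.randrange(2), rng.randrange(len(BIDS)))
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.choice(ACTIONS)                      # explore
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])    # exploit
        r, c, cars = step(s[0], s[1], BIDS[s[2]], a)
        # The next request (origin and bid) is drawn uniformly at random.
        s_next = (cars, rng.randrange(2), rng.randrange(len(BIDS)))
        q_update(Q, s, a, r, c, s_next, lam)
        s = s_next
    return Q
```

Calling `train(lam=1.0)` returns a table from which a greedy policy can be read off per state. Raising `lam` penalizes empty stations more heavily, which in this toy setting is what can make accepting a negative bid worthwhile when the trip refills a depleted station.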
