Semi-parametric dynamic contextual pricing
This paper addresses dynamic pricing for e-commerce platforms. Its modeling approach is novel, but the improvements in algorithm design are incremental.
The paper tackles revenue maximization in e-commerce pricing by using contextual information to predict customer valuations when only binary transaction outcomes are observed, and presents an algorithm with $\tilde O(\sqrt{n})$ regret. It also reports good empirical performance from a scalable implementation.
Motivated by the application of real-time pricing in e-commerce platforms, we consider the problem of revenue maximization in a setting where the seller can leverage contextual information describing the customer's history and the product's type to predict her valuation of the product. However, her true valuation is unobservable to the seller; only a binary outcome, the success or failure of a transaction, is observed. Unlike in usual contextual bandit settings, the optimal price/arm given a covariate in our setting is sensitive to the detailed characteristics of the residual uncertainty distribution. We develop a semi-parametric model in which the residual distribution is non-parametric and provide the first algorithm which learns both regression parameters and residual distribution with $\tilde O(\sqrt{n})$ regret. We empirically test a scalable implementation of our algorithm and observe good performance.
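
To illustrate why the optimal price is sensitive to the residual distribution, here is a minimal sketch under assumptions that are not taken from the paper: the valuation is an additive mean (as would be predicted from covariates) plus an independent residual, a sale happens when the posted price does not exceed the valuation, and the price is chosen by grid search. The function names, the two residual distributions, and the numbers are all hypothetical. Both residuals have zero mean and unit variance, yet they lead to different revenue-maximizing prices.

```python
import math


def sale_probability(price, mean_valuation, residual_sf):
    """P(valuation >= price), assuming valuation = mean_valuation + residual.

    residual_sf is the survival function of the residual (illustrative assumption).
    """
    return residual_sf(price - mean_valuation)


def best_price(mean_valuation, residual_sf, grid):
    """Grid-search the price maximizing expected revenue: price * P(sale)."""
    return max(
        grid,
        key=lambda p: p * sale_probability(p, mean_valuation, residual_sf),
    )


def gaussian_sf(z):
    """Survival function of a standard normal residual."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def uniform_sf(z):
    """Survival function of a residual uniform on [-sqrt(3), sqrt(3)] (variance 1)."""
    a = math.sqrt(3.0)
    return min(1.0, max(0.0, (a - z) / (2.0 * a)))


# Hypothetical setup: same predicted mean valuation, two residual shapes.
grid = [i / 100 for i in range(0, 1001)]  # candidate prices 0.00 .. 10.00
mean_valuation = 2.0

p_gauss = best_price(mean_valuation, gaussian_sf, grid)
p_unif = best_price(mean_valuation, uniform_sf, grid)

print("best price, Gaussian residual:", p_gauss)
print("best price, uniform residual: ", p_unif)
```

In this toy setting the two prices differ even though the mean and variance of the residual agree, which is consistent with the abstract's point that knowing the regression parameters alone is not enough to price well.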