Optimal Contextual Pricing under Agnostic Non-Lipschitz Demand
For researchers in online learning and pricing, this work closes a long-standing open regret gap in linear-valuation contextual pricing under agnostic non-Lipschitz noise.
The paper tackles contextual dynamic pricing with linear valuations and agnostic non-Lipschitz noise, achieving $\tilde O(T^{2/3})$ optimal regret, which matches lower bounds and improves over the previous $\tilde O(T^{3/4})$ regret.
We study contextual dynamic pricing with linear valuations and bounded-support agnostic noise, whose induced demand curve may be non-Lipschitz with arbitrary jumps and atoms. Such discontinuities break the cross-context interpolation arguments used by smooth-demand pricing algorithms, and the best previous method achieved only $\tilde O(T^{3/4})$ regret. We propose Conservative-Markdown Redirect-UCB Pricing, a polynomial-time algorithm that combines randomized parameter estimation, conservative residual-grid probing, and confidence-based one-step redirection. Our algorithm achieves the optimal $\tilde O(T^{2/3})$ regret, matching the known lower bound of Kleinberg and Leighton (2003) up to logarithmic factors and improving over the previous upper bound of Xu and Wang (2022). Under stochastic well-conditioned contexts, this closes the long-standing open regret gap in linear-valuation contextual pricing under agnostic non-Lipschitz noise distributions.
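To make the setting concrete, the following is a minimal sketch of the problem model only, not of the proposed algorithm. It assumes a valuation of the form (context · theta) + noise, with noise supported on a few atoms, so the demand curve is a step function and expected revenue can drop sharply with an arbitrarily small price increase. The function names, the atoms, and the parameter values are hypothetical choices for illustration.

```python
# Illustrative model of linear-valuation pricing with atomic (non-Lipschitz) noise.
# All names and numbers here are assumptions made for this sketch.

def demand(residual_price, atoms, probs):
    """P(noise >= residual_price) for noise supported on finitely many atoms."""
    return sum(p for a, p in zip(atoms, probs) if a >= residual_price)

def expected_revenue(price, context, theta, atoms, probs):
    """Expected revenue price * P(context . theta + noise >= price)."""
    linear_part = sum(x * w for x, w in zip(context, theta))
    return price * demand(price - linear_part, atoms, probs)

# Hypothetical instance: noise takes values -0.5, 0.0, 0.5.
ATOMS = [-0.5, 0.0, 0.5]
PROBS = [0.25, 0.5, 0.25]
THETA = [1.0, 0.5]
CONTEXT = [0.5, 1.0]  # linear part is 0.5 * 1.0 + 1.0 * 0.5 = 1.0

rev_at_atom = expected_revenue(1.0, CONTEXT, THETA, ATOMS, PROBS)
rev_just_above = expected_revenue(1.001, CONTEXT, THETA, ATOMS, PROBS)
print(rev_at_atom, rev_just_above)
```

In this instance, pricing exactly at the atom sells with probability 0.75, while pricing 0.001 higher sells with probability 0.25. A jump of this kind is what prevents smoothness-based interpolation of demand across contexts.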