Smoothness-Adaptive Dynamic Pricing with Nonparametric Demand Learning
This work solves the problem of adaptivity in dynamic pricing for researchers and practitioners, but it is incremental as it builds on existing minimax optimal regret frameworks by introducing a self-similarity condition.
The paper tackles the dynamic pricing problem with nonparametric demand learning by addressing the challenge of adaptivity to the unknown Hölder smoothness parameter β, proving that no policy can achieve minimax optimal regret without knowledge of β, and proposing a self-similarity condition and algorithm that achieves the optimal regret bound of O(T^{(β+1)/(2β+1)}) without prior knowledge of β.
We study the dynamic pricing problem where the demand function is nonparametric and Hölder smooth, and we focus on adaptivity to the unknown Hölder smoothness parameter $β$ of the demand function. Traditionally the optimal dynamic pricing algorithm heavily relies on the knowledge of $β$ to achieve a minimax optimal regret of $\widetilde{O}(T^{\frac{β+1}{2β+1}})$. However, we highlight the challenge of adaptivity in this dynamic pricing problem by proving that no pricing policy can adaptively achieve this minimax optimal regret without knowledge of $β$. Motivated by the impossibility result, we propose a self-similarity condition to enable adaptivity. Importantly, we show that the self-similarity condition does not compromise the problem's inherent complexity since it preserves the regret lower bound $Ω(T^{\frac{β+1}{2β+1}})$. Furthermore, we develop a smoothness-adaptive dynamic pricing algorithm and theoretically prove that the algorithm achieves this minimax optimal regret bound without the prior knowledge $β$.