Constant Regret Re-solving Heuristics for Price-based Revenue Management
The result gives retailers a computationally cheaper way to manage inventory and pricing over time, though the contribution is incremental in that it tightens a regret bound from prior work.
The paper studies price-based revenue management and proves that a re-solving heuristic achieves constant, O(1), regret relative to the optimal policy, improving on the previous O(ln T) bound. It also shows an Ω(ln T) gap between the value of the optimal policy and that of the fluid model.
Price-based revenue management is an important problem in operations management with many practical applications. The problem considers a retailer who sells a product (or multiple products) over $T$ consecutive time periods and is subject to constraints on the initial inventory levels. While the optimal pricing policy could be obtained via dynamic programming, such an approach is sometimes undesirable because of its high computational cost. Approximate policies, such as re-solving heuristics, are often applied as computationally tractable alternatives. In this paper, we show the following two results. First, we prove that a natural re-solving heuristic attains $O(1)$ regret compared to the value of the optimal policy. This improves the $O(\ln T)$ regret upper bound established in the prior work of \cite{jasin2014reoptimization}. Second, we prove that there is an $\Omega(\ln T)$ gap between the value of the optimal policy and that of the fluid model. This complements our upper bound result by showing that the fluid model is not an adequate information-relaxed benchmark when analyzing price-based revenue management algorithms.
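To make the idea of a re-solving heuristic concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: a single product, at most one sale per period, and a linear purchase probability $d(p) = a - bp$. In each period the policy re-solves the fluid (deterministic) problem using the current inventory and the remaining horizon, then posts the resulting price. The function names, the demand model, and the parameter values $a = 0.8$, $b = 0.4$ are all hypothetical and chosen only for illustration.

```python
import random


def fluid_price(inventory, periods_left, a, b):
    """Price solving the single-product fluid problem for d(p) = a - b*p.

    The fluid problem maximizes p * d(p) per period subject to
    periods_left * d(p) <= inventory (assumed model, not the paper's).
    """
    # Maximizer of the revenue rate p * (a - b*p) when inventory is not binding.
    unconstrained = a / (2 * b)
    # Price at which expected demand per period equals inventory / periods_left.
    run_out = (a - inventory / periods_left) / b
    # The higher of the two prices respects the inventory constraint.
    return max(unconstrained, run_out)


def simulate_resolving(T, initial_inventory, a=0.8, b=0.4, seed=0):
    """Simulate one sample path of the re-solving policy.

    Returns (total revenue, leftover inventory).
    """
    rng = random.Random(seed)
    inventory = initial_inventory
    revenue = 0.0
    for t in range(T):
        if inventory == 0:
            break  # nothing left to sell
        # Re-solve the fluid problem with the current state.
        price = fluid_price(inventory, T - t, a, b)
        purchase_prob = a - b * price
        if rng.random() < purchase_prob:
            inventory -= 1
            revenue += price
    return revenue, inventory


def fluid_value(T, initial_inventory, a=0.8, b=0.4):
    """Optimal value of the fluid problem solved once at time zero."""
    price = fluid_price(initial_inventory, T, a, b)
    return T * price * (a - b * price)


if __name__ == "__main__":
    rev, left = simulate_resolving(T=100, initial_inventory=20)
    print("revenue:", rev, "leftover inventory:", left)
    print("fluid value:", fluid_value(T=100, initial_inventory=20))
```

Averaging `simulate_resolving` over many seeds and comparing the result with `fluid_value` gives a rough empirical view of the distance to the fluid benchmark. Note that the regret in the paper is measured against the value of the optimal policy, which this sketch does not compute.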