GT LG, Jul 8, 2022

Online Learning in Supply-Chain Games

Nicolò Cesa-Bianchi, Tommaso Cesari, Takayuki Osogami, Marco Scarsini, Segev Wasserkrug

arXiv:2207.04054v11.23 citations, h-index: 36

Originality: Incremental advance

AI Analysis

This paper addresses profit optimization in supply-chain management, but the advance is incremental: it builds on existing game-theory and online-learning frameworks.

The paper tackles the problem of maximizing profits in a repeated supplier-retailer game with partial knowledge of demand and cost parameters, showing that natural learning dynamics converge to the Stackelberg equilibrium and providing finite-time regret bounds for the supplier and asymptotic bounds for the retailer.

We study a repeated game between a supplier and a retailer who want to maximize their respective profits without full knowledge of the problem parameters. After characterizing the uniqueness of the Stackelberg equilibrium of the stage game with complete information, we show that even with partial knowledge of the joint distribution of demand and production costs, natural learning dynamics guarantee convergence of the joint strategy profile of supplier and retailer to the Stackelberg equilibrium of the stage game. We also prove finite-time bounds on the supplier's regret and asymptotic bounds on the retailer's regret, where the specific rates depend on the type of knowledge preliminarily available to the players. In the special case when the supplier is not strategic (vertical integration), we prove optimal finite-time regret bounds on the retailer's regret (or, equivalently, the social welfare) when costs and demand are adversarially generated and the demand is censored.
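To make the notion of a Stackelberg equilibrium of the stage game concrete, here is a minimal sketch under assumptions that are not taken from the paper: a newsvendor-style retailer selling at a fixed retail price `p`, demand uniform on [0, 1], a constant production cost `c`, and a supplier who leads by posting a wholesale price `w`. All names (`retailer_best_response`, `supplier_profit`, `stackelberg`) are hypothetical. The supplier anticipates the retailer's best-response order quantity and picks `w` by backward induction, approximated here with a grid search.

```python
import numpy as np

# Illustrative stage game only; the model below is an assumption, not the paper's.
# Retailer profit for order q at wholesale price w: p * E[min(q, D)] - w * q,
# with D ~ Uniform[0, 1], so E[min(q, D)] = q - q**2 / 2 for q in [0, 1].

def retailer_best_response(w: float, p: float) -> float:
    """Order quantity maximizing the retailer's expected profit.

    The first-order condition p * (1 - q) = w gives q = 1 - w / p.
    """
    return float(np.clip(1.0 - w / p, 0.0, 1.0))

def supplier_profit(w: float, p: float, c: float) -> float:
    """Supplier's profit when the retailer best-responds to w."""
    return (w - c) * retailer_best_response(w, p)

def stackelberg(p: float, c: float, grid_size: int = 10001) -> tuple[float, float]:
    """Approximate the leader's optimal w by grid search over [c, p]."""
    ws = np.linspace(c, p, grid_size)
    profits = np.array([supplier_profit(w, p, c) for w in ws])
    w_star = float(ws[np.argmax(profits)])
    return w_star, retailer_best_response(w_star, p)

w_star, q_star = stackelberg(p=1.0, c=0.2)
print(f"wholesale price: {w_star:.3f}, order quantity: {q_star:.3f}")
```

In this toy model the supplier's profit (w - c)(1 - w/p) is a concave quadratic in w, so the grid search should land close to the closed-form maximizer w = (p + c) / 2. The learning dynamics studied in the paper replace this full-information computation with estimates built from repeated play.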
