Convergence of Stochastic First-Order Algorithms in Bertrand Competition Under Incomplete Information
For researchers and practitioners in algorithmic pricing, this work gives a theoretical foundation for the convergence of gradient-based learning in incomplete-information markets and a principled counterpoint to claims of algorithmic collusion.
The paper proves that Euclidean Regularized Robbins-Monro algorithms converge almost surely to the unique, efficient Bayes-Nash equilibrium of a Bayesian Bertrand duopoly with private costs, within a finite-dimensional approximation of the strategy space, even though the setting violates standard stability conditions such as monotonicity and the Minty variational inequality.
Autonomous pricing agents are widely deployed in online marketplaces, making algorithmic pricing a prominent application of multi-agent learning. Experimental studies often report collusive outcomes, but these findings typically rely on Q-learning in complete-information environments and lack rigorous convergence guarantees. In this paper, we study the stochastic learning dynamics of Regularized Robbins-Monro (RRM) algorithms in a Bayesian Bertrand competition with private costs. We show that this setting violates standard stability conditions, including monotonicity and the Minty variational inequality, rendering classical convergence results for gradient-based learning inapplicable. Despite this, we prove that Euclidean RRM algorithms converge almost surely to the unique, efficient Bayes-Nash equilibrium within a finite-dimensional approximation of the strategy space. By analyzing symmetric piecewise-linear pricing strategies in a duopoly, we explicitly construct a global Lyapunov function for the projected primal dynamics and establish global asymptotic stability of the equilibrium. Our analysis yields rigorous convergence guarantees for stochastic first-order learning algorithms in Bayesian Bertrand competition and provides a principled counterpoint to widespread claims of algorithmic collusion.
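As a rough illustration of the kind of update the abstract refers to, the following minimal sketch runs a projected (Euclidean) Robbins-Monro iteration with a noisy gradient oracle. It is not the paper's implementation: the names `project_box`, `euclidean_rrm`, and `noisy_gradient`, the box constraint, the step-size schedule, and the toy concave objective are all assumptions chosen for illustration, and the Bertrand payoff and the piecewise-linear strategy parameterization are not modeled here.

```python
import numpy as np


def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d (hypothetical feasible set)."""
    return np.clip(x, lo, hi)


def euclidean_rrm(grad_oracle, x0, lo, hi, steps=5000, gamma0=1.0, power=1.0, seed=0):
    """Projected stochastic gradient ascent with Robbins-Monro step sizes.

    Step sizes gamma_t = gamma0 / t**power with power in (1/2, 1] satisfy
    sum(gamma_t) = inf and sum(gamma_t**2) < inf.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for t in range(1, steps + 1):
        gamma = gamma0 / t ** power
        # Ascent step along the noisy gradient, then project back onto the box.
        x = project_box(x + gamma * grad_oracle(x, rng), lo, hi)
    return x


# Toy stand-in for a payoff gradient (assumption): maximize -0.5 * ||x - target||^2,
# observed through additive Gaussian noise.
target = np.array([0.25, 0.5, 0.75])


def noisy_gradient(x, rng):
    return (target - x) + rng.normal(scale=0.1, size=x.shape)


x_final = euclidean_rrm(noisy_gradient, x0=np.zeros(3), lo=0.0, hi=1.0)
print(x_final)
```

In this toy problem the gradient field is strongly monotone, so convergence follows from classical results. The abstract's point is that the Bertrand setting lacks such structure, which is why the paper constructs a Lyapunov function instead.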