Winning at Any Cost -- Infringing the Cartel Prohibition With Reinforcement Learning
This paper addresses the risk of AI-driven price-fixing in markets, an incremental but important concern for regulators and businesses.
The paper investigates whether reinforcement learning agents used for e-commerce pricing can inadvertently form collusive cartels. Using a modified prisoner's dilemma in which three agents play rock paper scissors, it shows that action selection can be dissected into specific stages, which could support collusion prevention systems, and that agents can perform a tacit cooperation strategy without being explicitly trained to do so.
Pricing decisions are increasingly made by AI. Thanks to their ability to train on live market data while making decisions on the fly, deep reinforcement learning algorithms are especially effective at making such pricing decisions. In e-commerce scenarios, multiple reinforcement learning agents can set prices based on their competitors' prices. Research therefore suggests that agents might end up in a state of collusion in the long run. To further analyze this issue, we build a scenario based on a modified version of a prisoner's dilemma in which three agents play the game of rock paper scissors. Our results indicate that action selection can be dissected into specific stages, which makes it possible to develop collusion prevention systems that recognize situations that might lead to collusion between competitors. We furthermore provide evidence for a situation in which agents are capable of performing a tacit cooperation strategy without being explicitly trained to do so.
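
To make the three-agent setting concrete, the following is a minimal sketch of a single round of three-player rock paper scissors. The abstract does not specify the payoff structure or how the prisoner's dilemma is modified, so the scoring rule here (+1 for each opponent beaten, -1 for each opponent that wins) and the names `ACTIONS`, `BEATS`, and `payoffs` are assumptions chosen purely for illustration, not the authors' environment.

```python
from itertools import product

ACTIONS = ("rock", "paper", "scissors")

# BEATS[x] is the action that x defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoffs(moves):
    """Return one payoff per player for a single round.

    Assumed rule (not taken from the paper): a player gains 1 for every
    opponent it beats and loses 1 for every opponent that beats it.
    """
    scores = []
    for i, mine in enumerate(moves):
        score = 0
        for j, theirs in enumerate(moves):
            if i == j:
                continue  # a player does not play against itself
            if BEATS[mine] == theirs:
                score += 1
            elif BEATS[theirs] == mine:
                score -= 1
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Enumerate all 27 joint actions of the three agents.
    for joint in product(ACTIONS, repeat=3):
        print(joint, payoffs(joint))
```

Under this assumed rule every round is zero-sum, so any payoff gained by one agent is lost by the others. A reinforcement learning agent in such a setting would observe only these per-round rewards, which is why any coordination that emerges between two agents would be tacit rather than explicitly trained.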