Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising
This work addresses how advertisers in display advertising can bid strategically in real-time markets to maximize revenue and ROI. It represents an incremental improvement with domain-specific applications.
The paper tackles the problem of optimizing real-time bidding in display advertising by formulating it as a multi-agent reinforcement learning task, proposing a clustering method and distributed coordinated bidding to balance competition and cooperation among advertisers. Results show that cluster-based bidding outperforms single-agent and bandit approaches, and coordinated bidding achieves better overall objectives than purely self-interested agents.
Real-time advertising allows advertisers to bid on each impression from a visiting user. To optimize specific goals such as maximizing the revenue and return on investment (ROI) generated by ad placements, advertisers need not only to estimate the relevance between ads and users' interests but, most importantly, to respond strategically to the other advertisers bidding in the market. In this paper, we formulate bidding optimization as a multi-agent reinforcement learning problem. To handle a large number of advertisers, we propose a clustering method and assign a strategic bidding agent to each cluster. We propose and implement a practical Distributed Coordinated Multi-Agent Bidding (DCMAB) method to balance the tradeoff between competition and cooperation among advertisers. An empirical study on industry-scale real-world data demonstrates the effectiveness of our methods. Our results show that cluster-based bidding substantially outperforms single-agent and bandit approaches, and that coordinated bidding achieves better overall objectives than purely self-interested bidding agents.
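The idea of grouping advertisers into clusters and giving each cluster one bidding agent can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scalar score used for grouping, the rank-based bucketing, and the names `cluster_advertisers`, `ClusterAgent`, and `bid_multiplier` are all hypothetical. The abstract does not specify the clustering criterion or the agents' action space, and no learning is shown here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ClusterAgent:
    """One bidding agent shared by every advertiser in a cluster (hypothetical)."""
    cluster_id: int
    bid_multiplier: float = 1.0  # assumed action: a factor applied to a base bid

    def bid(self, base_bid: float) -> float:
        # Scale the advertiser's base bid by the cluster-level factor.
        return base_bid * self.bid_multiplier


def cluster_advertisers(scores: Dict[str, float], k: int) -> Dict[str, int]:
    """Assign each advertiser to one of k clusters by rank on a scalar score.

    The score is a stand-in feature (for example, historical revenue);
    the paper's actual clustering criterion may differ.
    """
    ranked = sorted(scores, key=scores.get)  # lowest score first
    n = len(ranked)
    # Split the ranking into k contiguous, roughly equal-sized buckets.
    return {adv: min(i * k // n, k - 1) for i, adv in enumerate(ranked)}


# Six advertisers with made-up scores, grouped into three clusters.
scores = {"a": 1.0, "b": 5.0, "c": 2.0, "d": 9.0, "e": 3.0, "f": 7.0}
assignment = cluster_advertisers(scores, k=3)
agents = {c: ClusterAgent(cluster_id=c) for c in range(3)}

# Every advertiser in a cluster bids through that cluster's agent.
agents[2].bid_multiplier = 1.5
bid_for_d = agents[assignment["d"]].bid(base_bid=2.0)
```

In this sketch the number of agents depends on `k` rather than on the number of advertisers, which is the scaling benefit the abstract attributes to clustering. A learned policy would replace the fixed `bid_multiplier`.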