Impression Allocation and Policy Search in Display Advertising
This addresses a key operational challenge for large publishers in maximizing revenue and impression quality, but it is incremental as it builds on existing auction and reinforcement learning frameworks.
The paper tackles the problem of optimizing impression allocation between guaranteed contracts and real-time bidding for publishers in online display advertising, proposing a multi-agent reinforcement learning method that adjusts bids to achieve optimal outcomes, with experiments on real-world datasets showing effectiveness.
In online display advertising, guaranteed contracts and real-time bidding (RTB) are two major ways to sell impressions for a publisher. For large publishers, simultaneously selling impressions through both guaranteed contracts and in-house RTB has become a popular choice. Generally speaking, a publisher needs to derive an impression allocation strategy between guaranteed contracts and RTB to maximize its overall outcome (e.g., revenue and/or impression quality). However, deriving the optimal strategy is not a trivial task, e.g., the strategy should encourage incentive compatibility in RTB and tackle common challenges in real-world applications such as unstable traffic patterns (e.g., impression volume and bid landscape changing). In this paper, we formulate impression allocation as an auction problem where each guaranteed contract submits virtual bids for individual impressions. With this formulation, we derive the optimal bidding functions for the guaranteed contracts, which result in the optimal impression allocation. In order to address the unstable traffic pattern challenge and achieve the optimal overall outcome, we propose a multi-agent reinforcement learning method to adjust the bids from each guaranteed contract, which is simple, converging efficiently and scalable. The experiments conducted on real-world datasets demonstrate the effectiveness of our method.