AI LG Feb 18, 2020

MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding

Haolin Zhou, Chaoqi Yang, Xiaofeng Gao, Qiong Chen, Gongshen Liu, Guihai Chen

arXiv:2002.07408v29.58 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of optimizing bidding strategies for advertisers and ad platforms by handling multiple goals simultaneously, representing an incremental advancement in applying reinforcement learning to real-time bidding.

The paper tackles the challenge of balancing multiple objectives in real-time bidding for online advertising by proposing MoTiAC, a multi-objective reinforcement learning algorithm, which achieves improved performance on a large-scale commercial dataset compared to recent approaches.

Online Real-Time Bidding (RTB) is a complex auction game in which advertisers compete to bid for ad impressions when a user request occurs. Considering display cost, Return on Investment (ROI), and other influential Key Performance Indicators (KPIs), large ad platforms try to balance the trade-off among various goals dynamically. To address this challenge, we propose a Multi-ObjecTive Actor-Critics algorithm based on reinforcement learning (RL), named MoTiAC, for the problem of bidding optimization with various goals. In MoTiAC, objective-specific agents update the global network asynchronously with different goals and perspectives, leading to a robust bidding policy. Unlike previous RL models, the proposed MoTiAC can simultaneously fulfill multi-objective tasks in complicated bidding environments. In addition, we mathematically prove that our model will converge to Pareto optimality. Finally, experiments on a large-scale real-world commercial dataset from Tencent verify the effectiveness of MoTiAC versus a set of recent approaches.
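The idea of objective-specific agents updating one shared set of parameters can be illustrated with a toy sketch. This is not the MoTiAC algorithm: there is no actor-critic network and no bidding environment. The names `TARGETS`, `grad`, and `run_shared_updates` are hypothetical, the two quadratic losses are stand-ins for real objectives, and a seeded random schedule is assumed in place of truly asynchronous workers.

```python
import random

# Hypothetical toy targets: each "agent" prefers a different parameter vector.
TARGETS = [(1.0, 0.0), (0.0, 1.0)]


def grad(theta, target):
    """Gradient of the quadratic loss ||theta - target||^2."""
    return tuple(2.0 * (t - g) for t, g in zip(theta, target))


def run_shared_updates(steps=500, lr=0.05, seed=0):
    """Apply objective-specific gradient steps to one shared parameter vector."""
    rng = random.Random(seed)
    theta = (5.0, 5.0)  # shared "global" parameters, arbitrary start
    for _ in range(steps):
        # A seeded random schedule stands in for asynchronous workers.
        agent = rng.randrange(len(TARGETS))
        g = grad(theta, TARGETS[agent])
        theta = tuple(t - lr * gi for t, gi in zip(theta, g))
    return theta


if __name__ == "__main__":
    print(run_shared_updates())
```

In this toy setting the shared parameters end up on the line segment between the two targets, which is exactly the set of Pareto-optimal points for these two quadratic losses. The sketch says nothing about the convergence proof in the paper; it only shows how interleaved updates from different objectives pull a shared policy toward a compromise.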

