Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics
This paper addresses efficiency and regret in online advertising auctions, from the perspective of both platforms and advertisers. It represents a strong incremental improvement with specific theoretical guarantees.
The paper studies autobidding algorithms in online advertising. It proposes a gradient-based learning algorithm that satisfies budget and ROI constraints, achieves vanishing individual regret, and guarantees that the expected liquid welfare is at least half of the optimal expected liquid welfare, whether or not the dynamics converge to an equilibrium.
We study a game between autobidding algorithms that compete in an online advertising platform. Each autobidder is tasked with maximizing its advertiser's total value over multiple rounds of a repeated auction, subject to budget and return-on-investment constraints. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret. Our algorithm uses only bandit feedback and can be used with the first- or second-price auction, as well as with any "intermediate" auction format. Our main result is that when these autobidders play against each other, the resulting expected liquid welfare over all rounds is at least half of the expected optimal liquid welfare achieved by any allocation. This holds whether or not the bidding dynamics converges to an equilibrium.
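To make the idea of gradient-based pacing concrete, the following is a minimal sketch of a generic dual-gradient budget-pacing bidder in a repeated second-price auction. It is not the algorithm proposed in the paper: it handles only the budget constraint (no ROI constraint), and the function name `simulate_pacing`, the step size, and the per-round spend target `budget / T` are all assumptions made for illustration. Like the paper's setting, it uses only bandit feedback, meaning the bidder observes just its own outcome and payment in each round.

```python
import random


def simulate_pacing(values, competing_bids, budget, step_size=0.05):
    """Illustrative dual-gradient budget pacing (hypothetical helper).

    values[t]         : the bidder's value for the impression in round t
    competing_bids[t] : highest competing bid in round t (non-negative)
    budget            : total budget over all rounds
    Returns (total_value_won, total_spend).
    """
    num_rounds = len(values)
    target_spend = budget / num_rounds  # assumed per-round spend target
    mu = 0.0                            # pacing multiplier (dual variable)
    remaining = budget
    total_value = 0.0
    total_spend = 0.0

    for value, competing_bid in zip(values, competing_bids):
        # Shade the value by the multiplier and never bid above the
        # remaining budget, so the budget cannot be exceeded.
        bid = min(value / (1.0 + mu), remaining)

        if bid > competing_bid:
            payment = competing_bid  # second-price payment
            total_value += value
        else:
            payment = 0.0

        remaining -= payment
        total_spend += payment

        # Gradient step on the multiplier using only the bidder's own
        # payment: overspending raises mu, underspending lowers it.
        mu = max(0.0, mu - step_size * (target_spend - payment))

    return total_value, total_spend


if __name__ == "__main__":
    random.seed(0)
    vals = [random.random() for _ in range(1000)]
    others = [random.random() for _ in range(1000)]
    won_value, spend = simulate_pacing(vals, others, budget=50.0)
    print(f"value won: {won_value:.2f}, spend: {spend:.2f}")
```

Because the bid is capped at the remaining budget and the second-price payment is strictly below a winning bid, total spend in this sketch can never exceed the budget, regardless of the step size.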