AI | May 13, 2025 | Code
BAT: Benchmark for Auto-bidding Task
Alexandra Khirianova, Ekaterina Solodneva, Andrey Pudovikov et al.
The optimization of bidding strategies for online advertising slot auctions presents a critical challenge across numerous digital marketplaces. A significant obstacle to the development, evaluation, and refinement of real-time autobidding algorithms is the scarcity of comprehensive datasets and standardized benchmarks. To address this deficiency, we present an auction benchmark encompassing the two most prevalent auction formats. We implement a series of robust baselines on a novel dataset, addressing the most salient Real-Time Bidding (RTB) problem domains: budget pacing uniformity and Cost Per Click (CPC) constraint optimization. This benchmark provides a user-friendly and intuitive framework for researchers and practitioners to develop and refine innovative autobidding algorithms, thereby facilitating advancements in the field of programmatic advertising. The implementation and additional resources can be accessed at the following repository (https://github.com/avito-tech/bat-autobidding-benchmark, https://doi.org/10.5281/zenodo.14794182).
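As a rough illustration of the two problem domains named in this abstract, the sketch below scores how uniformly a budget was spent and computes the realized cost per click. The function names and the exact definition of the pacing score (mean absolute gap to a linear spend target, as a fraction of the budget) are assumptions made for this example, not the metrics defined by the benchmark.

```python
from typing import Sequence


def pacing_deviation(spend_per_step: Sequence[float], budget: float) -> float:
    """Mean absolute gap between cumulative spend and a uniform (linear)
    spend target, as a fraction of the budget. 0.0 means perfectly even pacing.

    This definition is an illustrative assumption, not the benchmark's metric.
    """
    if budget <= 0 or len(spend_per_step) == 0:
        raise ValueError("budget must be positive and spend_per_step non-empty")
    n = len(spend_per_step)
    cumulative = 0.0
    total_gap = 0.0
    for t, spend in enumerate(spend_per_step, start=1):
        cumulative += spend
        target = budget * t / n  # uniform pacing: spend t/n of the budget by step t
        total_gap += abs(cumulative - target)
    return total_gap / (n * budget)


def cost_per_click(total_spend: float, clicks: int) -> float:
    """Realized CPC; infinite when no clicks were bought."""
    return total_spend / clicks if clicks > 0 else float("inf")


if __name__ == "__main__":
    print(pacing_deviation([25.0, 25.0, 25.0, 25.0], 100.0))  # even spend
    print(pacing_deviation([100.0, 0.0, 0.0, 0.0], 100.0))    # front-loaded spend
    print(cost_per_click(50.0, 20))
```

A bidder under a CPC constraint would compare the second quantity against its target, while the first quantity penalizes campaigns that exhaust their budget early.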
GT | Oct 22, 2025 | Code
Autobidding Arena: unified evaluation of the classical and RL-based autobidding algorithms
Andrey Pudovikov, Alexandra Khirianova, Ekaterina Solodneva et al.
Advertisement auctions play a crucial role in revenue generation for e-commerce companies. To scale the bidding procedure to thousands of auctions, automatic bidding (autobidding) algorithms are actively developed in industry. Fair and reproducible evaluation of these algorithms is therefore an important problem. We present a standardized and transparent evaluation protocol for comparing classical and reinforcement learning (RL) autobidding algorithms. We consider the most efficient autobidding algorithms from different classes, e.g., those based on controllers, RL, and optimal formulas, and benchmark them in a bidding environment. We use the most recent open-source environment developed in industry, which accurately emulates the bidding process. Our work demonstrates the most promising use cases for the considered autobidding algorithms, highlights their surprising drawbacks, and evaluates them according to multiple metrics. We select evaluation metrics that capture the performance of the autobidding algorithms and the corresponding costs, and that track budget pacing. This choice of metrics makes our results applicable to a broad range of platforms where autobidding is effective. The comparison results help practitioners evaluate candidate autobidding algorithms from different perspectives and select those that are efficient with respect to their companies' targets.
LG | Mar 2
Uncertainty Quantification of Click and Conversion Estimates for the Autobidding
Ivan Zhigalskii, Andrey Pudovikov, Aleksandr Katrutsa et al.
Modern e-commerce platforms employ various auction mechanisms to allocate paid slots for a given item. To scale this approach to millions of auctions, the platforms offer promotion tools based on autobidding algorithms. These algorithms typically depend on the Click-Through-Rate (CTR) and Conversion-Rate (CVR) estimates provided by a pre-trained machine learning model. However, the predictions of such models are uncertain, and this uncertainty can significantly affect the performance of the autobidding algorithm. To address this issue, we propose the DenoiseBid method, which corrects the generated CTRs and CVRs to make the resulting bids more efficient in auctions. The underlying idea of our method is to employ a Bayesian approach and replace noisy CTR or CVR estimates with those from recovered distributions. To demonstrate the performance of the proposed approach, we perform extensive experiments on synthetic, iPinYou, and BAT datasets. To evaluate the robustness of our approach to the noise scale, we use both synthetic noise and noise estimated from the predictions of the pre-trained machine learning model.
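The Bayesian idea in this abstract can be illustrated with the simplest possible case: a Gaussian prior over the true CTR and additive Gaussian prediction noise, for which the posterior mean is a weighted average of the noisy estimate and the prior mean. This is a generic textbook sketch and not the DenoiseBid procedure; the function name, the Gaussian assumptions, and the clamping to [0, 1] are all choices made for this example.

```python
def posterior_mean_ctr(noisy_ctr: float,
                       prior_mean: float,
                       prior_var: float,
                       noise_var: float) -> float:
    """Posterior mean of the true CTR under an assumed model:
    true_ctr ~ N(prior_mean, prior_var), noisy_ctr = true_ctr + N(0, noise_var).

    The larger the noise variance relative to the prior variance, the more
    the noisy estimate is pulled toward the prior mean.
    """
    if prior_var < 0 or noise_var < 0 or prior_var + noise_var <= 0:
        raise ValueError("variances must be non-negative and not both zero")
    weight = prior_var / (prior_var + noise_var)  # trust placed in the observation
    estimate = prior_mean + weight * (noisy_ctr - prior_mean)
    return min(1.0, max(0.0, estimate))  # keep the result a valid probability


if __name__ == "__main__":
    # Equal prior and noise variance: the estimate lands halfway to the prior mean.
    print(posterior_mean_ctr(0.10, 0.02, 1e-4, 1e-4))
```

A bid computed from the corrected value rather than the raw prediction would be less sensitive to outlier CTR estimates, which is the kind of effect the abstract describes.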
LG | Feb 10, 2024
Fast UCB-type algorithms for stochastic bandits with heavy and super heavy symmetric noise
Yuriy Dorn, Aleksandr Katrutsa, Ilgam Latypov et al.
In this study, we propose a new method for constructing UCB-type algorithms for stochastic multi-armed bandits based on general convex optimization methods with an inexact oracle. We derive regret bounds corresponding to the convergence rates of the optimization methods. We propose a new algorithm, Clipped-SGD-UCB, and show, both theoretically and empirically, that in the case of symmetric noise in the reward, we can achieve an $O(\log T\sqrt{KT\log T})$ regret bound instead of $O\left(T^{\frac{1}{1+\alpha}} K^{\frac{\alpha}{1+\alpha}}\right)$ for the case when the reward distribution satisfies $\mathbb{E}_{X \in D}[|X|^{1+\alpha}] \leq \sigma^{1+\alpha}$ ($\alpha \in (0, 1]$), i.e., it performs better than the general lower bound for bandits with heavy tails would suggest. Moreover, the same bound holds even when the reward distribution does not have an expectation, that is, when $\alpha<0$.
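To give a feel for why clipping helps under symmetric heavy-tailed noise, the sketch below estimates the center of a Cauchy distribution (which has no expectation) by running SGD on the quadratic loss with clipped gradients. It covers only the mean-estimation ingredient and is not the Clipped-SGD-UCB algorithm from the paper; the clipping level, the `step_scale / t` step size, and all function names are assumptions chosen for this example.

```python
import math
import random
from typing import Iterable, List


def clip(value: float, level: float) -> float:
    """Project a scalar onto the interval [-level, level]."""
    return max(-level, min(level, value))


def clipped_sgd_mean(samples: Iterable[float],
                     clip_level: float = 1.0,
                     step_scale: float = 2.0,
                     start: float = 0.0) -> float:
    """SGD on f(x) = 0.5 * E[(x - X)^2] with clipped stochastic gradients.

    For noise that is symmetric around its center, the clipped gradient has
    zero mean exactly at the center, so the iterate is drawn toward it even
    when single samples are arbitrarily large.
    """
    x = start
    for t, sample in enumerate(samples, start=1):
        grad = x - sample                      # stochastic gradient of the loss
        x -= (step_scale / t) * clip(grad, clip_level)
    return x


def cauchy_samples(location: float, count: int, seed: int = 0) -> List[float]:
    """Draw Cauchy samples via the inverse CDF; seeded for reproducibility."""
    rng = random.Random(seed)
    return [location + math.tan(math.pi * (rng.random() - 0.5))
            for _ in range(count)]


if __name__ == "__main__":
    data = cauchy_samples(location=2.0, count=20000, seed=0)
    print("clipped SGD estimate:", clipped_sgd_mean(data))
    print("plain sample average:", sum(data) / len(data))
```

The plain sample average of Cauchy data does not concentrate as the sample grows, whereas the clipped iterate settles near the center of symmetry; a UCB-type rule could then build its index around such an estimate.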
LG | Feb 3, 2025
Optimizing Online Advertising with Multi-Armed Bandits: Mitigating the Cold Start Problem under Auction Dynamics
Anastasiia Soboleva, Andrey Pudovikov, Roman Snetkov et al.
Online advertising platforms often face a common challenge: the cold start problem. Insufficient behavioral data (clicks) makes accurate click-through rate (CTR) forecasting for new ads difficult. The CTR of "old" items can also be significantly underestimated because their early performance influences their long-term behavior on the platform. The cold start problem has far-reaching implications for businesses, including missed long-term revenue opportunities. To mitigate this issue, we developed a UCB-like algorithm in the multi-armed bandit (MAB) setting for the position-based model (PBM), specifically tailored to pay-per-click auction systems. Our proposed algorithm combines theory and practice: we obtain theoretical upper bounds on budget regret and conduct a series of experiments on synthetic and real-world data that confirm the applicability of the method on a real platform. In addition to increasing the platform's long-term profitability, we also propose a mechanism for maintaining short-term profits through controlled exploration and exploitation of items.
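A UCB-like score for a position-based model can be sketched as follows. Under the PBM, a click requires the slot to be examined, so the sketch counts each impression by an assumed examination probability of its position rather than as one full trial. This is a simplified generic index and not the algorithm proposed in the paper; the function name, the `exposure` bookkeeping, and the exploration constant are assumptions made for this example.

```python
import math


def pbm_ucb_index(clicks: float,
                  exposure: float,
                  round_number: int,
                  exploration: float = 2.0) -> float:
    """Optimistic CTR score for one ad.

    clicks       -- number of clicks observed for the ad
    exposure     -- sum of (assumed known) position examination probabilities
                    over all impressions of the ad; plays the role of a
                    position-discounted trial count
    round_number -- current round t >= 1
    """
    if round_number < 1:
        raise ValueError("round_number must be at least 1")
    if exposure <= 0:
        return float("inf")  # never-shown ads are explored first
    mean = clicks / exposure
    bonus = math.sqrt(exploration * math.log(round_number) / exposure)
    return mean + bonus


if __name__ == "__main__":
    # Same empirical CTR, but the newer ad has less exposure and a larger bonus.
    print(pbm_ucb_index(clicks=1, exposure=10.0, round_number=100))
    print(pbm_ucb_index(clicks=10, exposure=100.0, round_number=100))
```

Ranking by such an optimistic score gives new ads a temporary boost that shrinks as exposure accumulates, which is the usual way UCB-style rules address cold start.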