GT · LG · May 29, 2022

No-regret Learning in Repeated First-Price Auctions with Budget Constraints

arXiv:2205.14572v1 · 14 citations · h-index: 8
Originality: Incremental advance
AI Analysis

This paper addresses a practical challenge for advertisers as online markets shift to first-price auctions. It offers an incremental improvement by extending regret bounds to budget-constrained settings.

The paper tackles the problem of online bidding in repeated first-price auctions with budget constraints, proposing RL-based algorithms that achieve $\widetilde O(\sqrt T)$-regret with full bid revelation and $\widetilde O(T^{\frac{7}{12}})$-regret with only winning bid information.

Recently the online advertising market has exhibited a gradual shift from second-price auctions to first-price auctions. Although there has been a line of works concerning online bidding strategies in first-price auctions, it still remains open how to handle budget constraints in the problem. In the present paper, we initiate the study for a buyer with budgets to learn online bidding strategies in repeated first-price auctions. We propose an RL-based bidding algorithm against the optimal non-anticipating strategy under stationary competition. Our algorithm obtains $\widetilde O(\sqrt T)$-regret if the bids are all revealed at the end of each round. With the restriction that the buyer only sees the winning bid after each round, our modified algorithm obtains $\widetilde O(T^{\frac{7}{12}})$-regret by techniques developed from survival analysis. Our analysis extends to the more general scenario where the buyer has any bounded instantaneous utility function with regrets of the same order.
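To make the setting concrete, the following is a minimal sketch of the bidding environment only, not the paper's RL-based algorithm. It assumes values and the highest competing bid are drawn i.i.d. from Uniform(0, 1), that ties go to the buyer, and that the buyer uses a fixed bid-shading rule; the function name `simulate` and the parameter `shade` are hypothetical and chosen for illustration.

```python
import random


def simulate(T, budget, shade, seed=0):
    """Run T rounds of a first-price auction under a hard budget.

    Assumptions (not from the paper): values and the highest competing
    bid are i.i.d. Uniform(0, 1); the buyer bids shade * value, capped
    by the remaining budget; ties are won by the buyer.
    """
    rng = random.Random(seed)
    remaining = budget
    utility = 0.0
    for _ in range(T):
        value = rng.random()       # buyer's private value this round
        competing = rng.random()   # highest bid among competitors
        bid = min(shade * value, remaining)  # never exceed remaining budget
        if bid > 0 and bid >= competing:
            # First-price rule: the winner pays their own bid.
            utility += value - bid
            remaining -= bid
    return utility, remaining


utility, remaining = simulate(T=1000, budget=50.0, shade=0.5)
print(f"utility={utility:.2f}, remaining budget={remaining:.2f}")
```

Regret in the abstract's sense would compare the utility of a learning policy against that of the optimal non-anticipating strategy under the same stationary competition. Under winning-bid-only feedback, the learner would observe `competing` only in rounds it loses, which is the censoring that the survival-analysis techniques address.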

