Meta-Learning for Repeated Bayesian Persuasion
This work addresses how to optimize persuasion strategies across repeated interactions, as in advertising or recommendation systems. It is an incremental advance that extends single-game persuasion frameworks to the meta-learning setting.
The paper studies Bayesian persuasion repeated across multiple games and introduces Meta-Persuasion algorithms, which achieve provably sharper regret rates when tasks are similar and recover the standard single-game guarantees for arbitrary game sequences.
Classical Bayesian persuasion studies how a sender influences receivers through carefully designed signaling policies within a single strategic interaction. In many real-world environments, such interactions are repeated across multiple games, creating opportunities to exploit structural similarity across tasks. In this work, we introduce Meta-Persuasion algorithms, establishing the first line of theoretical results for both full-feedback and bandit-feedback settings in the Online Bayesian Persuasion (OBP) and Markov Persuasion Process (MPP) frameworks. We show that the proposed Meta-Persuasion algorithms achieve provably sharper regret rates under natural notions of task similarity, improving upon the best-known convergence rates for both OBP and MPP. At the same time, they recover the standard single-game guarantees when the sequence of games is chosen arbitrarily. Finally, we complement our theoretical analysis with numerical experiments that highlight our regret improvements and the benefits of meta-learning in repeated persuasion environments.
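To make the single-game building block concrete, the sketch below computes a sender-optimal signaling policy for the simplest textbook case of classical Bayesian persuasion: two states, two receiver actions, and a receiver who takes the sender's preferred action only when the posterior probability of state 1 reaches a threshold. This is an illustration of the classical setting only, not the Meta-Persuasion algorithms of the paper. The function name optimal_binary_signal and its parameters are hypothetical and chosen for this example.

```python
from fractions import Fraction


def optimal_binary_signal(prior, threshold):
    """Sender-optimal signaling for a two-state, two-action persuasion game.

    Assumptions (illustrative, not taken from the paper):
      - prior is P(state = 1), strictly between 0 and 1;
      - the receiver takes the sender-preferred action iff the posterior
        probability of state 1 is at least threshold;
      - the sender wants that action taken regardless of the state.

    Returns (p1, p0, value), where p1 and p0 are the probabilities of
    recommending the action in state 1 and state 0, and value is the
    overall probability that the action is taken.
    """
    prior = Fraction(prior)
    threshold = Fraction(threshold)
    if not (0 < prior < 1 and 0 < threshold < 1):
        raise ValueError("prior and threshold must lie strictly in (0, 1)")

    # If the prior already convinces the receiver, always recommend.
    if prior >= threshold:
        return Fraction(1), Fraction(1), Fraction(1)

    # Otherwise recommend always in state 1, and in state 0 with the largest
    # probability p0 that keeps the posterior at the threshold:
    #   prior / (prior + (1 - prior) * p0) = threshold
    p0 = prior * (1 - threshold) / ((1 - prior) * threshold)
    value = prior + (1 - prior) * p0
    return Fraction(1), p0, value


if __name__ == "__main__":
    # Prior 3/10 on state 1; the receiver acts at posterior >= 1/2.
    p1, p0, value = optimal_binary_signal(Fraction(3, 10), Fraction(1, 2))
    print(p1, p0, value)  # 1 3/7 3/5
```

With a prior of 3/10 and a threshold of 1/2, the sender raises the probability of the preferred action from 0 (no information) to 3/5 by recommending it with probability 3/7 in the unfavorable state. Exact fractions are used so the posterior constraint holds without floating-point error.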