GT IT LG SY OC

Mar 13, 2024

Learning How to Strategically Disclose Information

Raj Kiriti Velicheti, Melih Bastopcu, S. Rasoul Etesami, Tamer Başar

arXiv:2403.08741v12.34 citations

h-index: 14

ACC

Originality: Incremental advance

AI Analysis

This work addresses the challenge of designing information without knowing the receiver's objective. The advance is incremental: it extends existing information design frameworks to online and adversarial settings.

The paper tackles the problem of strategic information disclosure in an online setting where a sender interacts with a receiver of unknown type, achieving regret bounds such as O(√T) for general convex utility functions and O(log(T)) for a Bayesian Persuasion variant with an informativeness penalty.

Strategic information disclosure, in its simplest form, considers a game between an information provider (sender), who has access to some private information, and an information receiver who is interested in that information. While the receiver takes an action that affects the utilities of both players, the sender can design the information (or modify the beliefs) of the receiver through signal commitment, hence posing a Stackelberg game. However, obtaining a Stackelberg equilibrium for this game traditionally requires the sender to have access to the receiver's objective. In this work, we consider an online version of information design where a sender interacts with a receiver of an unknown type who is adversarially chosen at each round. Restricting attention to a Gaussian prior and quadratic costs for the sender and the receiver, we show that $\mathcal{O}(\sqrt{T})$ regret is achievable with full information feedback, where $T$ is the total number of interactions between the sender and the receiver. Further, we propose a novel parametrization that allows the sender to achieve $\mathcal{O}(\sqrt{T})$ regret for a general convex utility function. We then consider the Bayesian Persuasion problem with an additional cost term in the objective function that penalizes more informative signaling policies, and we obtain $\mathcal{O}(\log(T))$ regret. Finally, we establish a sublinear regret bound for the partial information feedback setting and provide simulations to support our theoretical results.
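To make the $\mathcal{O}(\sqrt{T})$ regret notion concrete, the following is a minimal sketch of generic projected online gradient descent with full-information feedback. It is not the paper's algorithm or parametrization: the scalar decision `x` in $[0, 1]$, the quadratic loss `(x - theta)^2`, the type sequence `thetas`, and all function names are illustrative assumptions. With step size $D / (G\sqrt{t})$, the standard online convex optimization analysis bounds the regret by $\tfrac{3}{2} G D \sqrt{T}$, which is $3\sqrt{T}$ here since the assumed diameter is $D = 1$ and the gradient bound is $G = 2$.

```python
import math
import random


def project(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def online_gradient_descent(thetas, diameter=1.0, grad_bound=2.0):
    """Projected OGD on the assumed losses f_t(x) = (x - theta_t)^2 over [0, 1].

    Returns the list of decisions played, one per round.
    """
    x = 0.5  # arbitrary starting decision (assumption)
    decisions = []
    for t, theta in enumerate(thetas, start=1):
        decisions.append(x)           # commit before theta_t is revealed
        grad = 2.0 * (x - theta)      # full-information feedback: gradient of f_t at x
        eta = diameter / (grad_bound * math.sqrt(t))
        x = project(x - eta * grad)
    return decisions


def regret(thetas, decisions):
    """Cumulative loss minus the loss of the best fixed decision in hindsight."""
    # The mean minimizes a sum of quadratics; it lies in [0, 1] when every theta does.
    best_fixed = project(sum(thetas) / len(thetas))
    incurred = sum((x - th) ** 2 for x, th in zip(decisions, thetas))
    benchmark = sum((best_fixed - th) ** 2 for th in thetas)
    return incurred - benchmark


if __name__ == "__main__":
    rng = random.Random(0)
    for T in (100, 1000, 10000):
        # Hypothetical type sequence; the bound holds for any sequence in [0, 1].
        thetas = [rng.choice((0.0, 1.0)) for _ in range(T)]
        r = regret(thetas, online_gradient_descent(thetas))
        print(T, "regret:", round(r, 3), "bound:", round(3 * math.sqrt(T), 3))
```

The benchmark is the best fixed decision in hindsight, which mirrors how regret is usually defined against an adversarially chosen sequence; the random sequence used in the demo is only a stand-in for such an adversary.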

