GTAIJul 22, 2020

Exploiting No-Regret Algorithms in System Design

arXiv:2007.11172v3
Originality Incremental advance
AI Analysis

This addresses a specific problem in game theory and system design for scenarios where one player aims to influence an adaptive opponent's behavior, but it is incremental as it builds on existing no-regret algorithms and minimax concepts.

The paper tackles the problem of a system designer guiding an opponent using a no-regret algorithm in a repeated zero-sum game to converge to a desired mixed strategy, by designing a payoff matrix with a unique minimax solution and proposing a new algorithm that provably achieves convergence.

We investigate a repeated two-player zero-sum game setting where the column player is also a designer of the system, and has full control on the design of the payoff matrix. In addition, the row player uses a no-regret algorithm to efficiently learn how to adapt their strategy to the column player's behaviour over time in order to achieve good total payoff. The goal of the column player is to guide her opponent to pick a mixed strategy which is favourable for the system designer. Therefore, she needs to: (i) design an appropriate payoff matrix $A$ whose unique minimax solution contains the desired mixed strategy of the row player; and (ii) strategically interact with the row player during a sequence of plays in order to guide her opponent to converge to that desired behaviour. To design such a payoff matrix, we propose a novel solution that provably has a unique minimax solution with the desired behaviour. We also investigate a relaxation of this problem where uniqueness is not required, but all the minimax solutions have the same mixed strategy for the row player. Finally, we propose a new game playing algorithm for the system designer and prove that it can guide the row player, who may play a \emph{stable} no-regret algorithm, to converge to a minimax solution.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes