Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games
This work addresses the challenge of exploiting human-like subrational behavior in game theory, offering incremental improvements for scalable algorithms in domains like AI and economics.
The paper tackles the problem of computing effective strategies against quantal (subrational) opponents in two-player games, proposing scalable heuristic approximations based on counterfactual regret minimization that outperform previous methods in exploiting such opponents while reducing exploitability by rational opponents.
Solution concepts of traditional game theory assume entirely rational players; therefore, their ability to exploit subrational opponents is limited. One type of subrationality that describes human behavior well is the quantal response. While there exist algorithms for computing solutions against quantal opponents, they either do not scale or may provide strategies that are even worse than the entirely rational Nash strategies. This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games. Our contributions are: (1) we define two different solution concepts related to exploiting quantal opponents and analyze their properties; (2) we prove that computing these solutions is computationally hard; (3) therefore, we evaluate several heuristic approximations based on scalable counterfactual regret minimization (CFR); and (4) we identify a CFR variant that exploits the bounded opponents better than the previously used variants while being less exploitable by the worst-case perfectly rational opponent.
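To make the notion of a quantal opponent concrete, the following is a minimal sketch of a quantal response in a small normal-form game. It assumes the commonly used logit form, in which an action is played with probability proportional to exp(lam * u(a)); the abstract does not state which form the paper uses. The function names, the rationality parameter `lam`, and the matching-pennies payoffs are illustrative assumptions and are not taken from the paper.

```python
import math


def quantal_response(utilities, lam):
    """Logit quantal response (assumed form): P(a) is proportional to exp(lam * u(a)).

    lam = 0 gives uniform play; a large lam approaches a best response.
    """
    m = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(lam * (u - m)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def column_utilities(col_payoff, row_strategy):
    """Expected utility of each column action against a mixed row strategy."""
    n_cols = len(col_payoff[0])
    return [
        sum(p * col_payoff[i][j] for i, p in enumerate(row_strategy))
        for j in range(n_cols)
    ]


def row_value(row_payoff, row_strategy, col_strategy):
    """Expected utility of the row player for a pair of mixed strategies."""
    return sum(
        p * q * row_payoff[i][j]
        for i, p in enumerate(row_strategy)
        for j, q in enumerate(col_strategy)
    )


# Hypothetical zero-sum example (matching pennies); the row player wins on a match.
row_payoff = [[1.0, -1.0], [-1.0, 1.0]]
col_payoff = [[-x for x in row] for row in row_payoff]

row_strategy = [0.7, 0.3]  # a fixed, non-equilibrium row strategy
col_utils = column_utilities(col_payoff, row_strategy)
qr = quantal_response(col_utils, lam=2.0)  # the opponent's noisy reply
value = row_value(row_payoff, row_strategy, qr)
```

In this sketch the column player puts more weight on the action with the higher expected utility but still plays the other action with positive probability, which is what separates a quantal opponent from a perfectly rational best responder.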