33.0 GT May 26
On characterization and existence of constrained correlated equilibria in Markov games
Tingting Ni, Anna Maddux, Maryam Kamgarpour
Markov games with coupling constraints model constrained dynamical decision-making involving self-interested agents, where the feasibility of an individual agent's strategy depends on the joint strategies of the others. Such games arise in numerous real-world applications involving safety requirements and budget caps, for example, in environmental management, electricity markets, and transportation systems. In unconstrained dynamical decision-making, the correlated equilibrium has emerged as a desired solution concept due to its computational tractability and amenability to learning algorithms. Understanding how coupling constraints shape correlated equilibria is a crucial step towards computing solutions in constrained Markov games. In this paper, we formalize and characterize the notion of constrained correlated equilibria for Markov games, defined as feasible joint policies where any unilateral deviation is either unprofitable or infeasible. Building on this characterization, we further study existence conditions for constrained correlated equilibria. In particular, we provide a novel existence proof of such equilibria in Markov games with coupling constraints.
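The verbal definition above ("feasible joint policies where any unilateral deviation is either unprofitable or infeasible") can be written schematically as follows. This is only a sketch: the symbols are assumptions introduced here, not the paper's notation. $\mathcal{C}$ stands for the set of joint policies satisfying the coupling constraints, $V_i$ for agent $i$'s value (assumed to be maximized), and $\phi_i \circ \pi$ for the joint policy obtained when agent $i$ alone applies the deviation $\phi_i$.

```latex
% Schematic constrained correlated equilibrium condition (assumed notation)
\pi^\star \in \mathcal{C}
\quad\text{and}\quad
\forall i,\ \forall \phi_i:\qquad
\phi_i \circ \pi^\star \notin \mathcal{C}
\quad\text{or}\quad
V_i(\phi_i \circ \pi^\star) \le V_i(\pi^\star).
```

Without constraints, every deviation is feasible, so the first alternative never applies and the condition reduces to the usual requirement that no unilateral deviation is profitable.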
49.4 GT Mar 18
Eliciting Truthful Feedback for Preference-Based Learning via the VCG Mechanism
Leo Landolt, Anna Maddux, Andreas Schlaginhaufen et al.
We study resource allocation problems in which a central planner allocates resources among strategic agents with private cost functions in order to minimize a social cost, defined as an aggregate of the agents' costs. This setting poses two main challenges: (i) the agents' cost functions may be unknown to them or difficult to specify explicitly, and (ii) agents may misreport their costs strategically. To address these challenges, we propose an algorithm that combines preference-based learning with Vickrey-Clarke-Groves (VCG) payments to incentivize truthful reporting. Our algorithm selects informative preference queries via D-optimal design, estimates cost parameters through maximum likelihood, and computes VCG allocations and payments based on these estimates. In a one-shot setting, we prove that the mechanism is approximately truthful, individually rational, and efficient up to an error of $\tilde{\mathcal O}(K^{-1/2})$ for $K$ preference queries per agent. In an online setting, these guarantees hold asymptotically with sublinear regret at a rate of $\tilde{\mathcal O}(T^{2/3})$ after $T$ rounds. Finally, we validate our approach through a numerical case study on demand response in local electricity markets.
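To illustrate the final step of the pipeline described above (computing a VCG allocation and payments from cost functions), here is a minimal Python sketch. It is not the authors' algorithm: the preference queries, D-optimal design, and maximum-likelihood estimation are omitted, and the cost tables are taken as given. The function `vcg`, the cost tables, and the toy allocation set are hypothetical names and data chosen for illustration, and the payment rule is the standard Clarke pivot.

```python
def vcg(cost_fns, allocations):
    """Return the cost-minimizing allocation and Clarke-pivot payments.

    cost_fns: one callable per agent, mapping an allocation to that
              agent's (reported or estimated) cost.
    allocations: iterable of feasible allocations.
    """
    allocations = list(allocations)

    def social_cost(x, exclude=None):
        # Sum of costs over all agents, optionally leaving one agent out.
        return sum(c(x) for j, c in enumerate(cost_fns) if j != exclude)

    # Efficient allocation: minimize the aggregate cost.
    x_star = min(allocations, key=social_cost)

    payments = []
    for i in range(len(cost_fns)):
        # Cost borne by the others at the chosen allocation ...
        others_at_star = social_cost(x_star, exclude=i)
        # ... versus the best they could achieve if agent i were ignored.
        others_best = min(social_cost(x, exclude=i) for x in allocations)
        # Agent i pays the externality it imposes on the others.
        payments.append(others_at_star - others_best)
    return x_star, payments


# Toy example (hypothetical data): split 2 units between two agents.
# Each table maps "units received" to the agent's cost.
table_0 = {0: 10, 1: 4, 2: 1}
table_1 = {0: 9, 1: 5, 2: 3}
cost_fns = [lambda x: table_0[x[0]], lambda x: table_1[x[1]]]
allocations = [(0, 2), (1, 1), (2, 0)]

x_star, payments = vcg(cost_fns, allocations)
print(x_star, payments)
```

In this toy instance the even split minimizes the total cost, and each agent pays the amount by which its presence raises the other agent's cost. In the setting of the abstract, the cost functions passed to such a routine would be the estimates learned from preference queries, which is why the guarantees there hold only approximately.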