GT · May 7

Online Scalarization in Vector-Valued Games

Ehsan Asadollahi, Calvin Hawkins, Matthew Hale

arXiv:2605.06624

Predicted impact: top 63% in GT (last 90 days) · Originality: incremental advance

AI Analysis

For multi-agent systems with vector-valued payoffs, this work provides a method to adaptively select scalarizations online, improving convergence to desired equilibria.

This paper introduces a bi-level learning framework for repeated multi-player vector-valued games in which the scalarization is treated as an online decision variable. In the reported experiments, convergence to the preferred equilibrium rises from roughly 50% under non-adaptive scalarization to about 80% under the proposed method.

We study repeated multi-player vector-valued games in which a player observes a payoff vector each round and evaluates outcomes through linear scalarizations of those vectors. Unlike most prior work, we treat the choice of scalarization as an online decision variable rather than a fixed modeling decision. We propose a bi-level learning framework in which an outer learner chooses a scalarization from a finite candidate class on a slow timescale, while a faster inner bandit no-regret learner selects actions using the scalar feedback induced by the chosen scalarization. Performance of this approach is defined with respect to a certain true weight vector, and the deployed scalarizations act as control signals that shape the induced payoff trajectory. We provide implementable algorithms based on bandit online mirror descent with stabilized importance weighting, and we derive finite-time performance guarantees in the form of sublinear regret bounds. Experiments on a vector-valued extension of a canonical game show that convergence to the preferred equilibrium rises from roughly $50\%$ under non-adaptive scalarization to about $80\%$ under our proposed method.
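The two-timescale structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes an Exp3-style update with implicit exploration (the `gamma` term in the importance weight) as a stand-in for "bandit online mirror descent with stabilized importance weighting", and it assumes the outer learner is scored by the epoch-average payoff under the true weight vector. The names `Exp3IX`, `run_bilevel`, `GAME`, `payoff_fn`, the step sizes, and the toy 2x2 payoff tensor are all hypothetical, chosen only for illustration.

```python
import numpy as np


class Exp3IX:
    """Exp3-style bandit learner with implicit exploration; rewards in [0, 1]."""

    def __init__(self, n_arms, eta, gamma, rng):
        self.log_w = np.zeros(n_arms)  # log-weights, for numerical stability
        self.eta, self.gamma, self.rng = eta, gamma, rng

    def probs(self):
        z = np.exp(self.log_w - self.log_w.max())
        return z / z.sum()

    def sample(self):
        p = self.probs()
        arm = int(self.rng.choice(len(p), p=p))
        return arm, p[arm]

    def update(self, arm, p_arm, reward):
        # Importance-weighted loss estimate; gamma in the denominator keeps
        # the estimate bounded when p_arm is small (assumed stabilization).
        self.log_w[arm] -= self.eta * (1.0 - reward) / (p_arm + self.gamma)


def run_bilevel(payoff_fn, candidates, true_w, n_actions,
                n_epochs, epoch_len, seed=0):
    """Outer learner picks a weight vector per epoch; inner learner picks actions."""
    rng = np.random.default_rng(seed)
    outer = Exp3IX(len(candidates), eta=0.1, gamma=0.05, rng=rng)  # slow timescale
    inner = Exp3IX(n_actions, eta=0.1, gamma=0.05, rng=rng)        # fast timescale
    total = 0.0
    for _ in range(n_epochs):
        k, p_k = outer.sample()
        w = candidates[k]                    # deployed scalarization for this epoch
        epoch_true = 0.0
        for _ in range(epoch_len):
            a, p_a = inner.sample()
            v = payoff_fn(a, rng)            # observed payoff vector in [0, 1]^d
            inner.update(a, p_a, float(w @ v))   # inner sees scalarized feedback only
            epoch_true += float(true_w @ v)      # performance uses the true weights
        outer.update(k, p_k, epoch_true / epoch_len)
        total += epoch_true
    return total / (n_epochs * epoch_len), outer.probs(), inner.probs()


# Toy 2x2 game (hypothetical): GAME[a, b] is the payoff vector for our action a
# against opponent action b; the opponent here plays uniformly at random.
GAME = np.array([[[1.0, 0.0], [0.6, 0.1]],
                 [[0.1, 0.6], [0.0, 1.0]]])


def payoff_fn(a, rng):
    return GAME[a, rng.integers(2)]


candidates = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])  # finite candidate class
true_w = np.array([0.2, 0.8])
avg, outer_p, inner_p = run_bilevel(payoff_fn, candidates, true_w,
                                    n_actions=2, n_epochs=200, epoch_len=50)
```

Candidate weights are taken on the probability simplex and payoff entries in [0, 1], so every scalarized reward stays in [0, 1]. Whether the inner learner should be reset at each epoch boundary is a design choice; this sketch keeps a single persistent inner learner.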
