Lacra Pavel

OC
h-index: 20
13 papers
422 citations
Novelty: 48%
AI Score: 30

13 Papers

OC · Mar 15, 2017
A distributed primal-dual algorithm for computation of generalized Nash equilibria with shared affine coupling constraints via operator splitting methods

Peng Yi, Lacra Pavel · utoronto

In this paper, we propose a distributed primal-dual algorithm for computation of a generalized Nash equilibrium (GNE) in noncooperative games over network systems. In the considered game, not only each player's local objective function depends on other players' decisions, but also the feasible decision sets of all the players are coupled together with a globally shared affine inequality constraint. Adopting the variational GNE, that is the solution of a variational inequality, as a refinement of GNE, we introduce a primal-dual algorithm that players can use to seek it in a distributed manner. Each player only needs to know its local objective function, local feasible set, and a local block of the affine constraint. Meanwhile, each player only needs to observe the decisions on which its local objective function explicitly depends through the interference graph and share information related to multipliers with its neighbors through a multiplier graph. Through a primal-dual analysis and an augmentation of variables, we reformulate the problem as finding the zeros of a sum of monotone operators. Our distributed primal-dual algorithm is based on forward-backward operator splitting methods. We prove its convergence to the variational GNE for fixed step-sizes under some mild assumptions. Then a distributed algorithm with inertia is also introduced and analyzed for variational GNE seeking. Finally, numerical simulations for network Cournot competition are given to illustrate the algorithm efficiency and performance.
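The primal-dual idea in this abstract can be illustrated with a small centralized sketch. It is not the paper's distributed algorithm: it assumes a hypothetical two-player Cournot game with one shared capacity constraint x1 + x2 <= capacity, a single shared multiplier rather than local multiplier copies exchanged over a multiplier graph, and illustrative values for the costs, demand parameters, and step sizes (tau, sigma). Each iteration takes a projected gradient step in the decisions and a projected ascent step in the multiplier, using an extrapolated decision as is common in forward-backward schemes of this type.

```python
def clip(value, low, high):
    """Project a scalar onto the interval [low, high]."""
    return max(low, min(high, value))


def cournot_primal_dual(costs=(1.0, 2.0), a=10.0, b=1.0, capacity=4.0,
                        x_max=10.0, tau=0.1, sigma=0.1, iterations=2000):
    """Centralized primal-dual iteration for a toy Cournot game (all values hypothetical).

    Player i minimizes J_i(x) = c_i * x_i - x_i * (a - b * sum(x))
    subject to 0 <= x_i <= x_max and the shared constraint sum(x) <= capacity.
    """
    x = [0.0 for _ in costs]
    lam = 0.0  # single shared multiplier (an assumption of this sketch)
    for _ in range(iterations):
        total = sum(x)
        # Pseudo-gradient: partial derivative of J_i with respect to x_i.
        grad = [c - a + b * total + b * xi for c, xi in zip(costs, x)]
        # Primal step: projected gradient step on the Lagrangian.
        x_new = [clip(xi - tau * (g + lam), 0.0, x_max)
                 for xi, g in zip(x, grad)]
        # Dual step: projected ascent using the extrapolated decision 2*x_new - x.
        lam = max(0.0, lam + sigma * (2.0 * sum(x_new) - total - capacity))
        x = x_new
    return x, lam


if __name__ == "__main__":
    decisions, multiplier = cournot_primal_dual()
    print(decisions, multiplier)
```

For these illustrative parameters the variational GNE can be worked out by hand from the KKT conditions: x = (2.5, 1.5) with multiplier 2.5, which gives a reference point for checking the iterates.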

OC · Mar 6, 2019
Dynamic NE Seeking for Multi-Integrator Networked Agents with Disturbance Rejection

Andrew R Romano, Lacra Pavel · utoronto

In this paper, we consider game problems played by (multi)-integrator agents, subject to external disturbances. We propose Nash equilibrium seeking dynamics based on gradient-play, augmented with a dynamic internal-model based component, which is a reduced-order observer of the disturbance. We consider single-, double- and extensions to multi-integrator agents, in a partial-information setting, where agents have only partial knowledge of the others' decisions over a network. The lack of global information is offset by each agent maintaining an estimate of the others' states, based on local communication with its neighbours. Each agent has an additional dynamic component that drives its estimates to the consensus subspace. In all cases, we show convergence to the Nash equilibrium irrespective of disturbances. Our proofs leverage input-to-state stability under strong monotonicity of the pseudo-gradient and Lipschitz continuity of the extended pseudo-gradient.

GT · Mar 28, 2017
A Distributed Nash Equilibrium Seeking in Networked Graphical Games

Farzad Salehisadaghiani, Lacra Pavel · utoronto

This paper considers a distributed gossip approach for finding a Nash equilibrium in networked games on graphs. In such games a player's cost function may be affected by the actions of any subset of players. An interference graph is employed to illustrate the partially-coupled cost functions and the asymmetric information requirements. For a given interference graph, network communication between players is considered to be limited. A generalized communication graph is designed so that players exchange only their required information. An algorithm is designed whereby players, with possibly partially-coupled cost functions, make decisions based on the estimates of other players' actions obtained from local neighbors. It is shown that this choice of communication graph guarantees that all players' information is exchanged after sufficiently many iterations. Using a set of standard assumptions on the cost functions, the interference and the communication graphs, almost sure convergence to a Nash equilibrium is proved for diminishing step sizes. Moreover, the case when the cost functions are not known by the players is investigated and a convergence proof is presented for diminishing step sizes. The effect of the second largest eigenvalue of the expected communication matrix on the convergence rate is quantified. The trade-off between parameters associated with the communication graph and the ones associated with the interference graph is illustrated. Numerical results are presented for a large-scale networked game.

SY · Dec 1, 2016
Distributed Nash Equilibrium Seeking via the Alternating Direction Method of Multipliers

Farzad Salehisadaghiani, Lacra Pavel · utoronto

In this paper, the problem of finding a Nash equilibrium of a multi-player game is considered. The players are only aware of their own cost functions as well as the action space of all players. We develop a relatively fast algorithm within the framework of inexact-ADMM. It requires a communication graph for the information exchange between the players as well as a few mild assumptions on cost functions. The convergence proof of the algorithm to a Nash equilibrium of the game is then provided. Moreover, the convergence rate is investigated via simulations.

SY · Mar 30, 2017
Nash Equilibrium Seeking with Non-doubly Stochastic Communication Weight Matrix

Farzad Salehisadaghiani, Lacra Pavel · utoronto

A distributed Nash equilibrium seeking algorithm is presented for networked games. We assume that each player has incomplete information about the other players' actions. The players communicate over a strongly connected digraph to send/receive the estimates of the other players' actions to/from the other local players according to a gossip communication protocol. Due to asymmetric information exchange between the players, a non-doubly (row) stochastic weight matrix is defined. We show that, due to the non-doubly stochastic property, the total average of all players' estimates is not preserved from one iteration to the next, which results in having no exact convergence. We present an almost sure convergence proof of the algorithm to a Nash equilibrium of the game. Then, we extend the algorithm to graphical games in which all players' cost functions are only dependent on the local neighboring players over an interference digraph. We impose an assumption on the communication digraph such that the players are able to update all the estimates of the players who interfere with their cost functions. It is shown that the communication digraph needs to be a superset of a transitive reduction of the interference digraph. Finally, we verify the efficacy of the algorithm via a simulation on a social media behavioral case.
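The remark about averages not being preserved can be checked on a toy example. The sketch below uses two fixed, hypothetical 3x3 weight matrices (the paper works with random gossip matrices, which are not modeled here): one is row stochastic only, the other doubly stochastic. Applying each to the same vector of estimates shows that only the doubly stochastic matrix keeps the average unchanged.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Hypothetical weights: every row sums to 1, but the columns do not.
ROW_STOCHASTIC = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.50, 0.00, 0.50],
]

# Hypothetical weights: every row and every column sums to 1.
DOUBLY_STOCHASTIC = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
]

estimates = [4.0, 0.0, 0.0]

if __name__ == "__main__":
    print("initial mean:          ", mean(estimates))
    print("after row-stochastic:  ", mean(mat_vec(ROW_STOCHASTIC, estimates)))
    print("after doubly-stochastic:", mean(mat_vec(DOUBLY_STOCHASTIC, estimates)))
```

The average is preserved exactly when the column sums are all one, since the sum of the entries of W x equals the column sums of W weighted by x.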

SY · Oct 6, 2016
Distributed Nash Equilibrium Seeking By Gossip in Games on Graphs

Farzad Salehisadaghiani, Lacra Pavel · utoronto

We consider a gossip approach for finding a Nash equilibrium in a distributed multi-player network game. We extend previous results on Nash equilibrium seeking to the case when the players' cost functions may be affected by the actions of any subset of players. An interference graph is employed to illustrate the partially-coupled cost functions and the asymmetric information requirements. For a given interference graph, we design a generalized communication graph so that players with possibly partially-coupled cost functions exchange only their required information and make decisions based on them. Using a set of standard assumptions on the cost functions, interference and communication graphs, we prove almost sure convergence to a Nash equilibrium for diminishing step sizes. We then quantify the effect of the second largest eigenvalue of the expected communication matrix on the convergence rate, and illustrate the trade-off between the parameters associated with the communication and the interference graphs. Finally, the efficacy of the proposed algorithm on a large-scale networked game is demonstrated via simulation.

GT · Mar 24, 2017
Generalized Nash Equilibrium Problem by the Alternating Direction Method of Multipliers

Farzad Salehisadaghiani, Lacra Pavel · utoronto

In this paper, the problem of finding a generalized Nash equilibrium (GNE) of a networked game is studied. Players are only able to choose their decisions from a feasible action set. The feasible set is considered to be a private linear equality constraint that is coupled through decisions of the other players. We assume that each player has its own private constraint, which need not be shared with the other players. This general case also embodies the one with shared constraints between players, and it can also be extended easily to the case with inequality constraints. Since the players don't have access to other players' actions, they need to exchange estimates of others' actions and a local copy of the Lagrangian multiplier with their neighbors over a connected communication graph. We develop a relatively fast algorithm by reformulating the conservative GNE problem within the framework of inexact-ADMM. The convergence of the algorithm is guaranteed under a few mild assumptions on cost functions. Finally, the algorithm is simulated for a wireless ad-hoc network.

LG · Oct 29, 2022
Recursive Reasoning in Minimax Games: A Level $k$ Gradient Play Method

Zichu Liu, Lacra Pavel · utoronto

Despite the success of generative adversarial networks (GANs) in generating visually appealing images, they are notoriously challenging to train. In order to stabilize the learning dynamics in minimax games, we propose a novel recursive reasoning algorithm: Level $k$ Gradient Play (Lv.$k$ GP) algorithm. In contrast to many existing algorithms, our algorithm does not require sophisticated heuristics or curvature information. We show that as $k$ increases, Lv.$k$ GP converges asymptotically towards an accurate estimation of players' future strategy. Moreover, we justify that Lv.$\infty$ GP naturally generalizes a line of provably convergent game dynamics which rely on predictive updates. Furthermore, we provide its local convergence property in nonconvex-nonconcave zero-sum games and global convergence in bilinear and quadratic games. By combining Lv.$k$ GP with Adam optimizer, our algorithm shows a clear advantage in terms of performance and computational overhead compared to other methods. Using a single Nvidia RTX3090 GPU and 30 times fewer parameters than BigGAN on CIFAR-10, we achieve an FID of 10.17 for unconditional image generation within 30 hours, allowing GAN training on common computational resources to reach state-of-the-art performance.
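A minimal sketch of the recursive-reasoning idea on the bilinear game f(x, y) = x * y, where x minimizes and y maximizes. It assumes a level-k update in which each player takes a gradient step from its current iterate against the opponent's level-(k-1) prediction, with level 0 being the current iterate; this is one reading of the abstract and may differ in detail from the paper's algorithm. The function names, step size, and starting point are illustrative.

```python
import math


def level_k_step(x, y, k, eta):
    """One level-k step on f(x, y) = x * y (df/dx = y, df/dy = x).

    Level 0 is the current iterate. Level j responds with a gradient step
    from (x, y) against the opponent's level-(j-1) prediction.
    """
    x_pred, y_pred = x, y
    for _ in range(k):
        # Both right-hand sides use the previous level's predictions.
        x_pred, y_pred = x - eta * y_pred, y + eta * x_pred
    return x_pred, y_pred


def distance_after(k, eta=0.5, steps=200, start=(1.0, 1.0)):
    """Distance to the equilibrium (0, 0) after a fixed number of steps."""
    x, y = start
    for _ in range(steps):
        x, y = level_k_step(x, y, k, eta)
    return math.hypot(x, y)


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, distance_after(level))
```

With k = 1 the update reduces to simultaneous gradient descent-ascent, whose iterates spiral away from the equilibrium on this game, while k = 2 already contracts for step sizes below one.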

GT · Mar 26, 2024
Paths to Equilibrium in Games

Bora Yongacoglu, Gürdal Arslan, Lacra Pavel et al.

In multi-agent reinforcement learning (MARL) and game theory, agents repeatedly interact and revise their strategies as new data arrives, producing a sequence of strategy profiles. This paper studies sequences of strategies satisfying a pairwise constraint inspired by policy updating in reinforcement learning, where an agent who is best responding in one period does not switch its strategy in the next period. This constraint merely requires that optimizing agents do not switch strategies, but does not constrain the non-optimizing agents in any way, and thus allows for exploration. Sequences with this property are called satisficing paths, and arise naturally in many MARL algorithms. A fundamental question about strategic dynamics is this: for a given game and initial strategy profile, is it always possible to construct a satisficing path that terminates at an equilibrium? The resolution of this question has implications about the capabilities or limitations of a class of MARL algorithms. We answer this question in the affirmative for normal-form games. Our analysis reveals a counterintuitive insight that reward-deteriorating strategic updates are key to driving play to equilibrium along a satisficing path.

OC · Nov 18, 2021
Second-Order Mirror Descent: Convergence in Games Beyond Averaging and Discounting

Bolin Gao, Lacra Pavel

In this paper, we propose a second-order extension of the continuous-time game-theoretic mirror descent (MD) dynamics, referred to as MD2, which provably converges to mere (but not necessarily strict) variationally stable states (VSS) without using common auxiliary techniques such as time-averaging or discounting. We show that MD2 enjoys no-regret as well as an exponential rate of convergence towards strong VSS upon a slight modification. MD2 can also be used to derive many novel continuous-time primal-space dynamics. We then use stochastic approximation techniques to provide a convergence guarantee of discrete-time MD2 with noisy observations towards interior mere VSS. Selected simulations are provided to illustrate our results.

OC · Dec 7, 2019
Continuous-time Discounted Mirror-Descent Dynamics in Monotone Concave Games

Bolin Gao, Lacra Pavel

In this paper, we consider concave continuous-kernel games characterized by monotonicity properties and propose discounted mirror descent-type dynamics. We introduce two classes of dynamics whereby the associated mirror map is constructed based on a strongly convex or a Legendre regularizer. Depending on the properties of the regularizer we show that these new dynamics can converge asymptotically in concave games with monotone (negative) pseudo-gradient. Furthermore, we show that when the regularizer enjoys strong convexity, the resulting dynamics can converge even in games with hypo-monotone (negative) pseudo-gradient, which corresponds to a shortage of monotonicity.
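To make the idea of discounting concrete, here is a small Euler-discretized sketch. It assumes dynamics of the form dz/dt = U(x) - z with x = softmax(z), that is, an entropy-type regularizer and unit discount rate; this specific form, the matching-pennies payoff matrix, the step size dt, and the initial scores are assumptions for illustration and are not taken from the paper. In this zero-sum example the strategies settle at the uniform mixed profile.

```python
import math


def softmax(z):
    """Softmax of a list of scores (shifted by the max for numerical stability)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_md(steps=5000, dt=0.01, z1=(2.0, 0.0), z2=(0.0, 1.0)):
    """Euler steps of dz/dt = U(x) - z, x = softmax(z), on matching pennies."""
    payoff = [[1.0, -1.0], [-1.0, 1.0]]  # player 1's payoffs; player 2 receives the negative
    z1, z2 = list(z1), list(z2)
    for _ in range(steps):
        x1, x2 = softmax(z1), softmax(z2)
        # Payoff vectors: expected payoff of each pure action against the opponent's mix.
        u1 = [sum(payoff[i][j] * x2[j] for j in range(2)) for i in range(2)]
        u2 = [-sum(payoff[i][j] * x1[i] for i in range(2)) for j in range(2)]
        # Discounted score update: the -z term pulls the scores back toward zero.
        z1 = [z + dt * (u - z) for z, u in zip(z1, u1)]
        z2 = [z + dt * (u - z) for z, u in zip(z2, u2)]
    return softmax(z1), softmax(z2)


if __name__ == "__main__":
    print(discounted_md())
```

Dropping the -z term gives undiscounted score accumulation, which is a natural baseline to compare against in the same example.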

LG · Feb 7, 2018
From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning

Mohammadhosein Hasanbeig, Lacra Pavel

The main focus of this paper is on the enhancement of two types of game-theoretic learning algorithms: log-linear learning and reinforcement learning. The standard analysis of log-linear learning needs a highly structured environment, i.e. strong assumptions about the game from an implementation perspective. In this paper, we introduce a variant of log-linear learning that provides asymptotic guarantees while relaxing the structural assumptions to include synchronous updates and limitations in information available to the players. On the other hand, model-free reinforcement learning is able to perform even under weaker assumptions on players' knowledge about the environment and other players' strategies. We propose a reinforcement algorithm that uses a double-aggregation scheme in order to deepen players' insight about the environment and a constant learning step-size, which achieves a higher convergence rate. Numerical experiments are conducted to verify each algorithm's robustness and performance.

OC · Apr 3, 2017
On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning

Bolin Gao, Lacra Pavel

In this paper, we utilize results from convex analysis and monotone operator theory to derive additional properties of the softmax function that have not yet been covered in the existing literature. In particular, we show that the softmax function is the monotone gradient map of the log-sum-exp function. By exploiting this connection, we show that the inverse temperature parameter determines the Lipschitz and co-coercivity properties of the softmax function. We then demonstrate the usefulness of these properties through an application in game-theoretic reinforcement learning.
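The gradient relationship stated here can be checked numerically. The sketch below assumes the scaled forms softmax_lam(z)_i = exp(lam * z_i) / sum_j exp(lam * z_j) and lse_lam(z) = (1 / lam) * log(sum_j exp(lam * z_j)), with lam the inverse temperature; the helper names and the random sampling ranges are illustrative. It compares a finite-difference gradient of the log-sum-exp function with the softmax output, and empirically estimates the largest ratio of output distance to input distance, which should not exceed lam.

```python
import math
import random


def softmax(z, lam=1.0):
    """Softmax with inverse temperature lam (max-shifted for stability)."""
    m = max(z)
    exps = [math.exp(lam * (zi - m)) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]


def log_sum_exp(z, lam=1.0):
    """Scaled log-sum-exp: (1 / lam) * log(sum(exp(lam * z_i)))."""
    m = max(z)
    return m + math.log(sum(math.exp(lam * (zi - m)) for zi in z)) / lam


def numerical_gradient(f, z, h=1e-6):
    """Central finite-difference gradient of a scalar function f at z."""
    grad = []
    for i in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[i] += h
        z_minus[i] -= h
        grad.append((f(z_plus) - f(z_minus)) / (2.0 * h))
    return grad


def max_lipschitz_ratio(lam, dim=4, trials=2000, seed=0):
    """Largest observed ||softmax(x) - softmax(y)|| / ||x - y|| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        y = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        gap = math.dist(x, y)
        if gap > 0.0:
            worst = max(worst, math.dist(softmax(x, lam), softmax(y, lam)) / gap)
    return worst


if __name__ == "__main__":
    point, lam = [0.5, -1.0, 2.0], 2.0
    print(numerical_gradient(lambda v: log_sum_exp(v, lam), point))
    print(softmax(point, lam))
    print(max_lipschitz_ratio(lam))
```

Random sampling only gives a lower estimate of the Lipschitz constant, so the check is a sanity test rather than a proof.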