SI Jul 12, 2016
Networked SIS Epidemics with Awareness
Keith Paarporn, Ceyhun Eksin, Joshua S. Weitz et al.
We study an SIS epidemic process over a static contact network where the nodes have partial information about the epidemic state. They react by limiting their interactions with their neighbors when they believe the epidemic is currently prevalent. A node's awareness is a weighted combination of the fraction of infected neighbors in its social network and a global broadcast of the fraction of infected nodes in the entire network. The dynamics of the benchmark (no awareness) and awareness models are described by discrete-time Markov chains, from which mean-field approximations (MFA) are derived. The states of the MFA are interpreted as the nodes' probabilities of being infected. We show that a sufficient condition for the existence of a "metastable", or endemic, state of the awareness model coincides with that of the benchmark model. Furthermore, we use a coupling technique to give a full stochastic comparison analysis between the two chains, which serves as a probabilistic analogue to the MFA analysis. In particular, we show that adding awareness reduces the expectation of any epidemic metric on the space of sample paths, e.g. eradication time or total infections. We characterize the reduction in expectations in terms of the coupling distribution. In simulations, we evaluate the effect that social distancing has on contact networks from different random graph families (geometric, Erdős-Rényi, and scale-free random networks).
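The mean-field update described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact equations, so the functional form below is an assumption: a standard discrete-time SIS mean-field map in which each node scales its contacts by a hypothetical distancing factor built from the local and global infected fractions. The names `mfa_step`, `alpha`, and `w` are illustrative and are not taken from the paper.

```python
import numpy as np

def mfa_step(p, A, beta, delta, alpha=0.0, w=0.5):
    """One step of an illustrative discrete-time SIS mean-field map.

    p     : (n,) current infection probabilities
    A     : (n, n) 0/1 adjacency matrix of the contact network
    beta  : per-contact infection probability
    delta : recovery probability
    alpha : awareness strength in [0, 1] (0 recovers the benchmark model)
    w     : weight on local versus global information (assumed form)
    """
    deg = np.maximum(A.sum(axis=1), 1)
    local = (A @ p) / deg              # expected fraction of infected neighbors
    glob = p.mean()                    # broadcast fraction of infected nodes
    # Hypothetical distancing factor: contacts shrink as perceived prevalence grows.
    contact = 1.0 - alpha * (w * local + (1 - w) * glob)
    # Probability that node i escapes infection from every neighbor.
    escape = np.prod(1.0 - beta * contact[:, None] * A * p[None, :], axis=1)
    return (1 - delta) * p + (1 - p) * (1 - escape)
```

Iterating `mfa_step` with `alpha=0` and with `alpha>0` from the same initial vector gives a quick way to compare the benchmark and awareness trajectories. In this sketch only the susceptible node's distancing is modeled, which may differ from the paper's interaction rule.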
SY Apr 3, 2018
Distributed Inertial Best-Response Dynamics
Brian Swenson, Ceyhun Eksin, Soummya Kar et al.
The note considers the problem of computing pure Nash equilibrium (NE) strategies in distributed (i.e., network-based) settings. It studies a class of inertial best response dynamics based on the fictitious play (FP) algorithm. It is shown that inertial best response dynamics are robust to informational limitations common in distributed settings. Fully distributed variants of FP with inertia and joint strategy FP with inertia are developed, and convergence to the set of pure NE is proven. The distributed algorithms rely on consensus methods. Results are validated using numerical simulations.
SY Feb 1, 2013
Bayesian Quadratic Network Game Filters
Ceyhun Eksin, Pooya Molavi, Alejandro Ribeiro et al.
A repeated network game where agents have quadratic utilities that depend on information externalities -- an unknown underlying state -- as well as payoff externalities -- the actions of all other agents in the network -- is considered. Agents play Bayesian Nash Equilibrium strategies with respect to their beliefs on the state of the world and the actions of all other nodes in the network. These beliefs are refined over subsequent stages based on the observed actions of neighboring peers. This paper introduces the Quadratic Network Game (QNG) filter that agents can run locally to update their beliefs, select corresponding optimal actions, and eventually learn a sufficient statistic of the network's state. The QNG filter is demonstrated on a Cournot market competition game and a coordination game to implement navigation of an autonomous team.
SY Mar 18, 2018
Optimal control policies for evolutionary dynamics with environmental feedback
Keith Paarporn, Ceyhun Eksin, Joshua S. Weitz et al.
We study a dynamical model of a population of cooperators and defectors whose actions have long-term consequences on environmental "commons" - what we term the "resource". Cooperators contribute to restoring the resource whereas defectors degrade it. The population dynamics evolve according to a replicator equation coupled with an environmental state. Our goal is to identify methods of influencing the population with the objective of maximizing accumulation of the resource. In particular, we consider strategies that modify individual-level incentives. We then extend the model to incorporate a public opinion state that imperfectly tracks the true environmental state, and study strategies that influence opinion. We formulate optimal control problems and solve them using numerical techniques to characterize locally optimal control policies for three problem formulations: 1) control of incentives, and control of opinions through 2) propaganda-like strategies and 3) awareness campaigns. We show numerically that the resulting controllers in all formulations achieve the objective, albeit with an unintended consequence. The resulting dynamics include cycles between low and high resource states - a dynamical regime termed an "oscillating tragedy of the commons". This outcome may have desirable average properties, but includes risks of resource depletion. Our findings suggest the need for new approaches to controlling coupled population-environment dynamics.
GT Feb 5, 2016
Distributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information
Ceyhun Eksin, Alejandro Ribeiro
A multi-agent system operates in an uncertain environment about which agents have different and time-varying beliefs that, as time progresses, converge to a common belief. A global utility function that depends on the realized state of the environment and the actions of all the agents determines the system's optimal behavior. We define the asymptotically optimal action profile as an equilibrium of the potential game defined by considering the expected utility with respect to the asymptotic belief. At finite time, however, agents' beliefs about the state of the environment are not entirely congruous, and agents may select conflicting actions. This paper proposes a variation of the fictitious play algorithm which is proven to converge to equilibrium actions if the state beliefs converge to a common distribution at a rate that is at least linear. In conventional fictitious play, agents build beliefs on others' future behavior by computing histograms of past actions and best respond to their expected payoffs integrated with respect to these histograms. In the variations developed here, histograms are built using knowledge of actions taken by nearby nodes, and best responses are further integrated with respect to the local beliefs on the state of the environment. We exemplify the use of the algorithm in coordination and target covering games.
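The conventional fictitious play step mentioned in this abstract (histograms of past actions, then a best response to them) can be sketched for two players as follows. This is only the conventional baseline, not the distributed variant proposed in the paper: there is no network, and no belief over the environment state. The function name and payoff matrices are illustrative assumptions.

```python
import numpy as np

def fictitious_play(U1, U2, rounds=50, init=(0, 1)):
    """Conventional two-player fictitious play (illustrative baseline).

    U1[a1, a2], U2[a1, a2] are the payoffs of players 1 and 2.
    Returns the action pair chosen after the last round.
    """
    n1, n2 = U1.shape
    counts1 = np.zeros(n1)   # histogram of player 1's past actions
    counts2 = np.zeros(n2)   # histogram of player 2's past actions
    a1, a2 = init
    for _ in range(rounds):
        counts1[a1] += 1
        counts2[a2] += 1
        f1 = counts1 / counts1.sum()   # empirical frequency of player 1
        f2 = counts2 / counts2.sum()   # empirical frequency of player 2
        # Each player best responds to the other's empirical frequency;
        # np.argmax breaks ties in favor of the lowest action index.
        a1 = int(np.argmax(U1 @ f2))
        a2 = int(np.argmax(f1 @ U2))
    return a1, a2
```

In a 2x2 coordination game (both payoff matrices equal to the identity), the players start miscoordinated with `init=(0, 1)` and settle on a common action after a few rounds under this tie-breaking rule.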
SY Dec 12, 2018
Control of learning in anti-coordination network games
Ceyhun Eksin, Keith Paarporn
We consider control of heterogeneous players repeatedly playing an anti-coordination network game. In an anti-coordination game, each player has an incentive to differentiate its action from its neighbors. At each round of play, players take actions according to a learning algorithm that mimics the iterated elimination of strictly dominated strategies. We show that the learning dynamics may fail to reach anti-coordination in certain scenarios. We formulate an optimization problem with the objective of reaching maximum anti-coordination while minimizing the number of players to control. We consider both static and dynamic control policy formulations. Relating the problem to a minimum vertex cover problem on bipartite networks, we develop a feasible dynamic policy that is efficient to compute. Solving for optimal policies on benchmark networks shows that the vertex cover based policy can be a loose upper bound when there is a potential to make use of cascades caused by the learning dynamics of uncontrolled players. We propose an algorithm that finds feasible, though possibly suboptimal, policies by sequentially adding players to control considering their cascade potential. Numerical experiments on random networks show the cascade-based algorithm can lower the control effort significantly compared to simpler control schemes.
AI Oct 19, 2024
Simulation-Based Optimistic Policy Iteration For Multi-Agent MDPs with Kullback-Leibler Control Cost
Khaled Nakhleh, Ceyhun Eksin, Sabit Ekin
This paper proposes an agent-based optimistic policy iteration (OPI) scheme for learning stationary optimal stochastic policies in multi-agent Markov Decision Processes (MDPs), in which agents incur a Kullback-Leibler (KL) divergence cost for their control efforts and an additional cost for the joint state. The proposed scheme consists of a greedy policy improvement step followed by an m-step temporal difference (TD) policy evaluation step. We use the separable structure of the instantaneous cost to show that the policy improvement step follows a Boltzmann distribution that depends on the current value function estimate and the uncontrolled transition probabilities. This allows agents to compute the improved joint policy independently. We show that both the synchronous (entire state space evaluation) and asynchronous (a uniformly sampled set of substates) versions of the OPI scheme with finite policy evaluation rollout converge to the optimal value function and an optimal joint policy asymptotically. Simulation results on a variant of the Stag-Hare game, formulated as a multi-agent MDP with KL control cost, validate our scheme's performance in terms of minimizing the cost return.
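The Boltzmann form of the policy improvement step can be illustrated in a single-agent setting. For each state x, the distribution that minimizes the expected next-state value plus the KL divergence from the uncontrolled transition probabilities is proportional to the uncontrolled probabilities times exp(-V). The sketch below assumes a single agent, no discount factor, and unit temperature; these simplifications and the name `kl_greedy_policy` are assumptions and do not reproduce the paper's multi-agent scheme.

```python
import numpy as np

def kl_greedy_policy(P0, V):
    """Greedy improvement under a KL control cost (single-agent sketch).

    For every state x, returns the row pi(.|x) that minimizes
        sum_x' pi(x'|x) * V(x') + KL(pi(.|x) || P0(.|x)),
    which is the Boltzmann distribution pi(x'|x) ~ P0(x'|x) * exp(-V(x')).

    P0 : (n, n) uncontrolled transition matrix (rows sum to 1)
    V  : (n,) current value function estimate
    """
    # Subtracting V.min() leaves the normalized result unchanged
    # and avoids overflow in the exponential.
    weights = P0 * np.exp(-(V - V.min()))[None, :]
    return weights / weights.sum(axis=1, keepdims=True)
```

Because each row depends only on `V` and the corresponding row of `P0`, the improved policy needs no search over actions, which is consistent with the abstract's remark that agents can compute the improvement independently.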
LG Oct 28, 2019
Distributed Networked Learning with Correlated Data
Lingzhou Hong, Alfredo Garcia, Ceyhun Eksin
We consider a distributed estimation method in a setting with heterogeneous streams of correlated data distributed across nodes in a network. In the considered approach, linear models are estimated locally (i.e., with only local data) subject to a network regularization term that penalizes a local model that differs from neighboring models. We analyze computation dynamics (associated with stochastic gradient updates) and information exchange (associated with exchanging current models with neighboring nodes). We provide a finite-time characterization of convergence of the weighted ensemble average estimate and compare this result to federated learning, an alternative approach to estimation wherein a single model is updated by locally generated gradient updates. This comparison highlights the trade-off between speed and precision: while model updates take place at a faster rate in federated learning, the proposed networked approach to estimation enables the identification of models with higher precision. We illustrate the method's general applicability in two examples: estimating a Markov random field using wireless sensor networks and modeling prey escape behavior of flocking birds based on a publicly available dataset.
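A rough sketch of the kind of update this abstract describes: each node takes a stochastic gradient step on its own squared-error loss plus a penalty on disagreement with neighboring models. The quadratic penalty, the step size `eta`, the weight `rho`, and the synchronous update are all assumptions for illustration; the paper's treatment of correlated data and its exact regularizer are not reproduced here.

```python
import numpy as np

def networked_sgd_step(theta, A, x, y, eta=0.05, rho=1.0):
    """One synchronous stochastic gradient step for all nodes (illustrative).

    Each node i holds a local linear model theta[i] and minimizes
        0.5 * (x[i] @ theta[i] - y[i])**2
        + 0.5 * rho * sum_j A[i, j] * ||theta[i] - theta[j]||^2 / 2,
    i.e. a local fit term plus an assumed quadratic network regularizer.

    theta : (n_nodes, d) local models
    A     : (n_nodes, n_nodes) symmetric 0/1 adjacency matrix
    x, y  : one fresh sample per node, shapes (n_nodes, d) and (n_nodes,)
    """
    resid = np.einsum("nd,nd->n", x, theta) - y       # local prediction errors
    grad_fit = resid[:, None] * x                     # gradient of the local loss
    deg = A.sum(axis=1)
    grad_net = rho * (deg[:, None] * theta - A @ theta)  # pull toward neighbors
    return theta - eta * (grad_fit + grad_net)
```

Setting `rho=0` decouples the nodes into independent local regressions, while a large `rho` pushes all local models toward a common estimate.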
SY Sep 26, 2016
Demand Response with Communicating Rational Consumers
Ceyhun Eksin, Hakan Delic, Alejandro Ribeiro
The performance of an energy system under a real-time pricing mechanism depends on the consumption behavior of its customers, which involves uncertainties. In this paper, we consider a system operator that charges its customers a real-time price that depends on the total realized consumption. Customers have unknown and heterogeneous consumption preferences. We propose behavior models in which customers act selfishly, altruistically or as welfare-maximizers. In addition, we consider information models where customers keep their consumption levels private, communicate with a neighboring set of customers, or receive broadcast demand from the operator. Our analysis focuses on the dispersion of the system performance under different consumption models. To this end, for each pair of behavior and information models we define and characterize optimal rational behavior, and provide a local algorithm that can be implemented by the consumption scheduler devices. Analytical comparisons of the two extreme information models, namely, private and complete information models, show that communication reduces demand uncertainty while having a negligible effect on aggregate consumer utility and welfare. In addition, we show the impact that real-time price policy parameters have on the expected welfare loss due to selfish behavior, affording critical policy insights.