Peter E. Caines

LG
h-index: 3
9 papers
17 citations
Novelty: 45%
AI Score: 37

9 Papers

SY Jun 11, 2020
Convex Analysis for LQG Systems with Applications to Major Minor LQG Mean-Field Game Systems

Dena Firoozi, Sebastian Jaimungal, Peter E. Caines

We develop a convex analysis approach for solving LQG optimal control problems and apply it to major-minor (MM) LQG mean-field game (MFG) systems. The approach retrieves the best response strategies for the major agent and all minor agents that attain an $ε$-Nash equilibrium. An important and distinctive advantage of this approach is that, unlike the classical approach in the literature, it avoids imposing assumptions on the evolution of the mean field. In particular, this provides a tool for dealing with complex and non-standard systems.
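
For reference, the $ε$-Nash property mentioned above can be stated in its standard form. The notation here is an assumption introduced for illustration and is not taken from the paper: $J_i$ is agent $i$'s cost, $\mathcal{U}_i$ its admissible strategy set, and $u_{-i}^\ast$ the strategies of all other agents. A strategy profile $(u_1^\ast, \dots, u_N^\ast)$ is an $ε$-Nash equilibrium if no agent can reduce its cost by more than $ε$ through a unilateral deviation:

$$
J_i(u_i^\ast, u_{-i}^\ast) \;\le\; \inf_{u_i \in \mathcal{U}_i} J_i(u_i, u_{-i}^\ast) + ε, \qquad i = 1, \dots, N.
$$

In mean field game results of this kind, $ε$ typically tends to zero as the number of minor agents $N$ grows.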

OC Sep 10, 2020
$ε$-Nash Equilibria for Major Minor LQG Mean Field Games with Partial Observations of All Agents

Dena Firoozi, Peter E. Caines

Partially observed major-minor LQG and nonlinear mean field game (PO MM LQG MFG) systems, in which the major agent's state is assumed to be partially observed by each minor agent while the major agent completely observes its own state, have been analysed in the literature. In this paper, PO MM LQG MFG problems with general information patterns are studied where (i) the major agent has partial observations of its own state, and (ii) each minor agent has partial observations of its own state and the major agent's state. The assumption of partial observations by all agents leads to a new situation involving the recursive estimation by each minor agent of the major agent's estimate of its own state. For a general case of indefinite LQG MFG systems, the existence of $ε$-Nash equilibria, together with the individual agents' control laws yielding the equilibria, is established via the Separation Principle.

SY Jan 9, 2022
A Class of Hybrid LQG Mean Field Games with State-Invariant Switching and Stopping Strategies

Dena Firoozi, Ali Pakniyat, Peter E. Caines

A novel framework is presented that combines Mean Field Game (MFG) theory and Hybrid Optimal Control (HOC) theory to obtain a unique $ε$-Nash equilibrium for a non-cooperative game with switching and stopping times. We consider the case where there exists one major agent with a significant influence on the system together with a large number of minor agents constituting two subpopulations, each agent with individually asymptotically negligible effect on the whole system. Each agent has stochastic linear dynamics with quadratic costs, and the agents are coupled in their dynamics and costs by the average state of minor agents (i.e. the empirical mean field). It is shown that for a class of Hybrid LQG MFGs, the optimal switching and stopping times are state-invariant and only depend on the dynamical parameters of each agent. Accordingly, a hybrid systems formulation of the game is presented via the indexing by discrete events: (i) the switching of the major agent between alternative dynamics or (ii) the termination of the agents' trajectories in one or both of the subpopulations of minor agents. Optimal switching and stopping time strategies, together with best response control actions for, respectively, the major agent and all minor agents, are established with respect to their individual cost criteria by an application of Hybrid LQG MFG theory.

LG Aug 7, 2022
Transmission Neural Networks: From Virus Spread Models to Neural Networks

Shuang Gao, Peter E. Caines

This work connects models for virus spread on networks with their equivalent neural network representations. Based on this connection, we propose a new neural network architecture, called Transmission Neural Networks (TransNNs), where activation functions are primarily associated with links and are allowed to have different activation levels. Furthermore, this connection leads to the discovery and the derivation of three new activation functions with tunable or trainable parameters. Moreover, we prove that TransNNs with a single hidden layer and a fixed non-zero bias term are universal function approximators. Finally, we present new fundamental derivations of continuous time epidemic network models based on TransNNs.
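
The link between a virus spread model and a network with link-wise activations can be illustrated with a small numerical sketch. It assumes a generic independent-transmission model, $p_i(k+1) = 1 - \prod_j (1 - w_{ij} p_j(k))$, which may differ in detail from the model used in the paper. The function names, the weight matrix `W`, and the probabilities `p` are all hypothetical. Under the change of variable $s_i = -\log(1 - p_i)$, the product becomes a sum of per-link activations, and the code checks that both forms agree.

```python
import numpy as np

def infection_step(p, W):
    """Product form: p_i(k+1) = 1 - prod_j (1 - W[i, j] * p_j(k))."""
    return 1.0 - np.prod(1.0 - W * p[None, :], axis=1)

def link_activation(w, s):
    """Per-link activation: -log(1 - w * (1 - exp(-s)))."""
    return -np.log1p(-w * (1.0 - np.exp(-s)))

def network_step(s, W):
    """Same update in the transformed state s_i = -log(1 - p_i):
    each node sums the activations of its incoming links."""
    return link_activation(W, s[None, :]).sum(axis=1)

rng = np.random.default_rng(0)
n = 5
W = rng.uniform(0.0, 0.9, size=(n, n))  # hypothetical transmission probabilities
p = rng.uniform(0.05, 0.8, size=n)      # hypothetical infection probabilities

p_next = infection_step(p, W)

# Map to the transformed state, apply the link-wise network, map back.
s_next = network_step(-np.log1p(-p), W)
p_from_network = 1.0 - np.exp(-s_next)
```

Here the nonlinearity sits on each link (it takes the weight as an argument) rather than on the node after a weighted sum, which is the structural point the abstract highlights.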

OC Jul 23, 2019
Mean Field Game Systems with Common Noise and Markovian Latent Processes

Dena Firoozi, Peter E. Caines, Sebastian Jaimungal

In many stochastic games stemming from financial models, the environment evolves with latent factors and there may be common noise across agents' states. Two classic examples are: (i) multi-agent trading on electronic exchanges, and (ii) systemic risk induced through inter-bank lending/borrowing. Moreover, agents' actions often affect the environment, and some agents may be small while others are large. Hence one subpopulation of agents may act as minor agents, while another class may act as major agents. To capture the essence of such problems, here, we introduce a general class of non-cooperative heterogeneous stochastic games with one major agent and a large population of minor agents where agents interact with an observed common process impacted by the mean field. A latent Markov chain and a latent Wiener process (common noise) modulate the common process, and agents cannot observe them. We use filtering techniques coupled with a convex analysis approach to (i) solve the mean field game limit of the problem, (ii) demonstrate that the best response strategies generate an $ε$-Nash equilibrium for finite populations, and (iii) obtain explicit characterisations of the best response strategies.

SY Oct 30, 2016
On the Control of Affine Systems with Safety Constraints: Relaxed In-Block Controllability

Mohamed K. Helwa, Peter E. Caines

We consider affine systems defined on polytopes and study the cases where the systems are not in-block controllable with respect to the given polytopes. These are the cases in which we cannot fully control the affine systems within the interior of a given polytope, representing the intersection of given safety constraints. Instead, we introduce in this paper the notion of relaxed in-block controllability (RIBC), which can be useful for the cases where one can distinguish between soft and hard safety constraints. In particular, we study whether all the states in the interior of a given polytope, formed by the intersection of soft safety constraints, are mutually accessible through the interior of a given bigger polytope, formed by the intersection of hard safety constraints, by applying uniformly bounded control inputs. By exploring the geometry of the problem, we provide necessary conditions for RIBC. We then show when these conditions are also sufficient. Several illustrative examples are also given to clarify the main results.

27.9 SI Apr 5
Transmission Neural Networks: Inhibitory and Excitatory Connections

Shuang Gao, Peter E. Caines

This paper extends the Transmission Neural Network model proposed by Gao and Caines in [1]-[3] to incorporate inhibitory connections and neurotransmitter populations. The extended network model contains binary neuronal states, transmission dynamics, and inhibitory and excitatory connections. Under technical assumptions, we establish the characterization of the firing probabilities of neurons, and show that such a characterization considering inhibitions can be equivalently represented by a neural network where each neuron has a continuous state of dimension 2. Moreover, we incorporate neurotransmitter populations into the model and establish the limit network model when the number of neurotransmitters at all synaptic connections goes to infinity. Finally, sufficient conditions for stability and contraction properties of the limit network model are established.

LG Nov 27, 2024
Concentration of Cumulative Reward in Markov Decision Processes

Borna Sayedana, Peter E. Caines, Aditya Mahajan

In this paper, we investigate the concentration properties of cumulative rewards in Markov Decision Processes (MDPs), focusing on both asymptotic and non-asymptotic settings. We introduce a unified approach to characterize reward concentration in MDPs, covering both infinite-horizon settings (i.e., average and discounted reward frameworks) and the finite-horizon setting. Our asymptotic results include the law of large numbers, the central limit theorem, and the law of iterated logarithms, while our non-asymptotic bounds include Azuma-Hoeffding-type inequalities and a non-asymptotic version of the law of iterated logarithms. Additionally, we explore two key implications of our results. First, we analyze the sample path behavior of the difference in rewards between any two stationary policies. Second, we show that two alternative definitions of regret for learning policies proposed in the literature are rate-equivalent. Our proof techniques rely on a novel martingale decomposition of cumulative rewards, properties of the solution to the policy evaluation fixed-point equation, and both asymptotic and non-asymptotic concentration results for martingale difference sequences.
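
The kind of martingale decomposition referred to above can be sketched numerically for the average-reward case. This is the textbook decomposition via the policy evaluation (Poisson) equation $J + V = r + PV$, not necessarily the paper's own construction. The transition matrix `P`, the reward vector `r`, and the horizon `T` are hypothetical. Along any sample path, $\sum_{t<T} r(S_t) - TJ = V(S_0) - V(S_T) + \sum_{t<T} D_{t+1}$, where $D_{t+1} = V(S_{t+1}) - (PV)(S_t)$ is a martingale difference sequence bounded by the span of $V$, so Azuma-Hoeffding-type bounds apply to its sum.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
P = rng.uniform(0.1, 1.0, size=(n, n))
P /= P.sum(axis=1, keepdims=True)   # hypothetical transition matrix under a fixed policy
r = rng.uniform(0.0, 1.0, size=n)   # hypothetical per-state reward

# Solve (I - P) V + J * 1 = r with the normalisation V[0] = 0.
# Unknown vector is (V[0], ..., V[n-1], J).
A = np.zeros((n + 1, n + 1))
A[:n, :n] = np.eye(n) - P
A[:n, n] = 1.0
A[n, 0] = 1.0
sol = np.linalg.solve(A, np.append(r, 0.0))
V, J = sol[:n], sol[n]

# Simulate one sample path of the induced Markov chain.
T = 2000
states = np.empty(T + 1, dtype=int)
states[0] = 0
for t in range(T):
    states[t + 1] = rng.choice(n, p=P[states[t]])

PV = P @ V
centred_reward = r[states[:T]].sum() - T * J             # cumulative reward minus T * J
martingale_sum = (V[states[1:]] - PV[states[:T]]).sum()  # sum of D_{t+1}
boundary = V[states[0]] - V[states[T]]                   # bounded remainder term

# Each martingale difference is bounded in absolute value by span(V).
span_V = V.max() - V.min()
```

The identity `centred_reward == boundary + martingale_sum` holds path by path up to floating-point error, so concentration of the cumulative reward reduces to concentration of a bounded martingale plus a bounded remainder.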

LG Dec 20, 2021
Strong Consistency and Rate of Convergence of Switched Least Squares System Identification for Autonomous Markov Jump Linear Systems

Borna Sayedana, Mohammad Afshari, Peter E. Caines et al.

In this paper, we investigate the problem of system identification for autonomous Markov jump linear systems (MJS) with complete state observations. We propose a switched least squares method for identification of MJS, show that this method is strongly consistent, and derive data-dependent and data-independent rates of convergence. In particular, our data-independent rate of convergence shows that, almost surely, the system identification error is $\mathcal{O}\big(\sqrt{\log(T)/T} \big)$ where $T$ is the time horizon. These results show that the switched least squares method for MJS has the same rate of convergence as the least squares method for autonomous linear systems. We derive our results by imposing a general stability assumption on the model called stability in the average sense. We show that stability in the average sense is a weaker form of stability compared to the stability assumptions commonly imposed in the literature. We present numerical examples to illustrate the performance of the proposed method.
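
A minimal sketch of the switched least squares idea follows, assuming that both the continuous state $x_t$ and the active mode are observed at every step, and that the dynamics are $x_{t+1} = A_{s_t} x_t + w_t$ with Gaussian noise. The mode matrices `A_true`, the mode transition matrix `P`, the horizon `T`, and the function name are hypothetical and are not taken from the paper. The estimator runs an ordinary least squares fit per mode, using only the time steps at which that mode was active.

```python
import numpy as np

rng = np.random.default_rng(2)
A_true = [np.array([[0.5, 0.1], [0.0, 0.4]]),
          np.array([[-0.3, 0.2], [0.1, 0.6]])]  # hypothetical mode dynamics
P = np.array([[0.9, 0.1], [0.2, 0.8]])           # hypothetical mode transition matrix
T, d = 5000, 2

# Simulate the autonomous Markov jump linear system.
x = np.zeros((T + 1, d))
modes = np.zeros(T, dtype=int)
mode = 0
for t in range(T):
    modes[t] = mode
    x[t + 1] = A_true[mode] @ x[t] + rng.normal(size=d)
    mode = rng.choice(2, p=P[mode])

def switched_least_squares(x, modes, num_modes):
    """Estimate one matrix per mode from the samples where that mode is active."""
    estimates = []
    for i in range(num_modes):
        idx = np.flatnonzero(modes == i)
        X, Y = x[idx], x[idx + 1]
        # Rows satisfy y = x @ A_i.T + noise, so lstsq returns an estimate of A_i.T.
        A_hat_T, *_ = np.linalg.lstsq(X, Y, rcond=None)
        estimates.append(A_hat_T.T)
    return estimates

A_hat = switched_least_squares(x, modes, 2)
errors = [np.linalg.norm(A_hat[i] - A_true[i]) for i in range(2)]
```

Rerunning with larger `T` and recording `errors` gives a rough empirical view of the convergence rate discussed in the abstract.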