SY Jan 9, 2022
A Class of Hybrid LQG Mean Field Games with State-Invariant Switching and Stopping Strategies
Dena Firoozi, Ali Pakniyat, Peter E. Caines
A novel framework is presented that combines Mean Field Game (MFG) theory and Hybrid Optimal Control (HOC) theory to obtain a unique $\epsilon$-Nash equilibrium for a non-cooperative game with switching and stopping times. We consider the case where there exists one major agent with a significant influence on the system together with a large number of minor agents constituting two subpopulations, each agent having an individually asymptotically negligible effect on the whole system. Each agent has stochastic linear dynamics with quadratic costs, and the agents are coupled in their dynamics and costs by the average state of the minor agents (i.e., the empirical mean field). It is shown that for a class of Hybrid LQG MFGs, the optimal switching and stopping times are state-invariant and depend only on the dynamical parameters of each agent. Accordingly, a hybrid systems formulation of the game is presented, indexed by discrete events: (i) the switching of the major agent between alternative dynamics or (ii) the termination of the agents' trajectories in one or both of the subpopulations of minor agents. Optimal switching and stopping time strategies, together with best response control actions for, respectively, the major agent and all minor agents, are established with respect to their individual cost criteria by an application of Hybrid LQG MFG theory.
55.8 SY May 2
Hybrid Optimal Control of Homogeneous Epidemiological Compartmental Models with Regime Switching
Tyler Halterman, Ali Pakniyat
Optimal intervention design is formulated as a hybrid optimal control problem for multiphase homogeneous epidemiological systems. The system extends a foundational compartmental model through intermediate phases that incorporate work-from-home (WFH) policies and a vaccination protocol, yielding a four-phase hybrid system that captures policy escalation and relaxation. Key characteristics of the resulting hybrid system include (i) phase-dependent continuous dynamics and running costs that respectively capture distinct disease transmission mechanisms and shifting public health socioeconomic trade-offs, (ii) a combination of autonomous and controlled switchings for intervention policies, whose times are co-optimized, whether indirectly via state thresholds or directly as decision variables alongside continuous inputs, to minimize the overall cost, and (iii) nontrivial state jump maps that govern transitions between phases with differing state and control space dimensions. The Hybrid Minimum Principle (HMP) is invoked to obtain the optimal solutions. Numerical results demonstrate that coordinating WFH policies with vaccination efforts provides improved mitigation of disease spread compared to single-phase policy interventions.
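To make the idea of an autonomous, threshold-triggered switching concrete, the following is a minimal sketch, not the paper's model. It assumes a plain SIR compartmental model with two phases (the abstract does not name the base model or its parameters), a forward-Euler integrator, and a hypothetical function name `simulate_sir`; the transmission rate drops once the infected fraction crosses a threshold. No optimization or Hybrid Minimum Principle machinery is included.

```python
def simulate_sir(beta_free, beta_restricted, gamma, threshold,
                 days, dt=0.01, i0=1e-3):
    """Forward-Euler SIR simulation with a one-way, threshold-triggered switch.

    All names and parameter values are illustrative assumptions.
    State variables are population fractions, so s + i + r stays equal to 1.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    restricted = False      # discrete phase: False = free, True = restricted
    switch_time = None      # time at which the autonomous switch fired
    peak = i                # largest infected fraction seen so far

    for k in range(int(round(days / dt))):
        # Autonomous switching: the phase changes when the state hits a threshold.
        if not restricted and i >= threshold:
            restricted = True
            switch_time = k * dt

        # Phase-dependent continuous dynamics.
        beta = beta_restricted if restricted else beta_free
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt

        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)

    return {"s": s, "i": i, "r": r, "peak": peak, "switch_time": switch_time}


# A threshold above 1 can never be reached, so this run never switches.
baseline = simulate_sir(0.3, 0.3, 0.1, threshold=2.0, days=300)
switched = simulate_sir(0.3, 0.15, 0.1, threshold=0.05, days=300)
print(baseline["peak"], switched["peak"], switched["switch_time"])
```

In a hybrid optimal control formulation such as the one described above, the threshold (or the switching time itself) would be a decision variable rather than a fixed input; here it is fixed purely for illustration.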
11.8 SY Apr 1
Robust Multi-Agent Safety via Tube-Based Tightened Exponential Barrier Functions
Armel Koulong, Ali Pakniyat
This paper presents a constructive framework for synthesizing provably safe controllers for nonlinear multi-agent systems subject to bounded disturbances. The methodology applies to systems representable in Brunovsky canonical form, accommodating arbitrary-order dynamics in multi-dimensional spaces. The central contribution is a method of constraint tightening that formally couples robust error feedback with nominal trajectory planning. The key insight is that the design of an ancillary feedback law, which confines state errors to a robust positively invariant (RPI) tube, simultaneously provides the exact information needed to ensure the safety of the nominal plan. Specifically, the geometry of the resulting RPI tube is leveraged via its support function to derive state-dependent safety margins. These margins are then used to systematically tighten the high relative-degree exponential control barrier function (eCBF) constraints imposed on the nominal planner. This integrated synthesis guarantees that any nominal trajectory satisfying the tightened constraints corresponds to a provably safe trajectory for the true, disturbed system. We demonstrate the practical utility of this formal synthesis method by implementing the planner within a distributed Model Predictive Control (MPC) scheme, which optimizes performance while inheriting the robust safety guarantees.
19.5 SY Apr 1
Tube-Based Safety for Anticipative Tracking in Multi-Agent Systems
Armel Koulong, Ali Pakniyat
A tube-based safety framework is presented for robust anticipative tracking in nonlinear Brunovsky multi-agent systems subject to bounded disturbances. The architecture establishes robust safety certificates for a feedforward-augmented ancillary control policy. By rendering the state-deviation dynamics independent of the agents' internal nonlinearities, the formulation strictly circumvents the restrictive Lipschitz-bound feasibility conditions otherwise required for robust stabilization. Consequently, this structure admits an explicit, closed-form robust positively invariant (RPI) tube radius that systematically attenuates the exponential control barrier function (eCBF) tightening margins, thereby mitigating constraint conservatism while preserving formal forward invariance. Within the distributed model predictive control (MPC) layer, mapping the local tube radii through the communication graph yields a closed-form global formation error bound formulated via the minimum singular value of the augmented Laplacian. Robust inter-agent safety is enforced with minimal communication overhead, requiring only a single scalar broadcast per neighbor at initialization. Numerical simulations confirm the framework's efficacy in safely navigating heterogeneous formations through cluttered environments.
OC Mar 26, 2021
Value Function Estimators for Feynman-Kac Forward-Backward SDEs in Stochastic Optimal Control
Kelsey P. Hawkins, Ali Pakniyat, Panagiotis Tsiotras
Two novel numerical estimators are proposed for solving forward-backward stochastic differential equations (FBSDEs) appearing in the Feynman-Kac representation of the value function in stochastic optimal control problems. In contrast to the current numerical approaches which are based on the discretization of the continuous-time FBSDE, we propose a converse approach, namely, we obtain a discrete-time approximation of the on-policy value function, and then we derive a discrete-time estimator that resembles the continuous-time counterpart. The proposed approach allows for the construction of higher accuracy estimators along with error analysis. The approach is applied to the policy improvement step in reinforcement learning. Numerical results and error analysis are demonstrated using (i) a scalar nonlinear stochastic optimal control problem and (ii) a four-dimensional linear quadratic regulator (LQR) problem. The proposed estimators show significant improvement in terms of accuracy in both cases over Euler-Maruyama-based estimators used in competing approaches. In the case of LQR problems, we demonstrate that our estimators result in near machine-precision level accuracy, in contrast to previously proposed methods that can potentially diverge on the same problems.
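As a small illustration of what a discrete-time on-policy value function looks like in a linear-quadratic setting, the sketch below evaluates a fixed policy exactly in discrete time and compares it with the continuous-time closed form. It is not the estimators proposed in the paper. It assumes a scalar closed-loop SDE $dx = a x\,dt + \sigma\,dW$ with running cost $q x^2$, zero terminal cost, and $a \neq 0$; the function names and parameter values are hypothetical.

```python
import math


def discrete_on_policy_value(a, sigma, q, horizon, steps):
    """Backward recursion for V_k(x) = p_k * x**2 + r_k in discrete time.

    Assumed Euler-Maruyama model: x_{k+1} = (1 + a*dt) * x_k + sigma*sqrt(dt)*w_k,
    with stage cost q * x_k**2 * dt and zero terminal cost.
    """
    dt = horizon / steps
    growth = 1.0 + a * dt
    p, r = 0.0, 0.0  # terminal condition V_N(x) = 0
    for _ in range(steps):
        # Taking the expectation over the noise adds sigma**2 * dt * p_{k+1}.
        r = r + sigma ** 2 * dt * p
        p = q * dt + growth ** 2 * p
    return p, r


def continuous_on_policy_value(a, sigma, q, horizon):
    """Closed form at t = 0 for V(0, x) = p * x**2 + r (requires a != 0)."""
    p = q * (math.exp(2.0 * a * horizon) - 1.0) / (2.0 * a)
    r = q * sigma ** 2 / (2.0 * a) * (
        (math.exp(2.0 * a * horizon) - 1.0) / (2.0 * a) - horizon
    )
    return p, r


p_exact, r_exact = continuous_on_policy_value(-1.0, 0.5, 1.0, 1.0)
for n in (100, 10_000):
    p_n, r_n = discrete_on_policy_value(-1.0, 0.5, 1.0, 1.0, n)
    print(n, abs(p_n - p_exact), abs(r_n - r_exact))
```

The gap between the two shrinks as the step count grows, which is the first-order discretization error that higher-accuracy estimators aim to reduce.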
OC Jun 22, 2020
Forward-Backward Rapidly-Exploring Random Trees for Stochastic Optimal Control
Kelsey P. Hawkins, Ali Pakniyat, Evangelos Theodorou et al.
We propose a numerical method for solving the forward-backward stochastic differential equations (FBSDEs) appearing in the Feynman-Kac representation of the value function in stochastic optimal control problems. Using the Girsanov change of probability measure, it is demonstrated how a rapidly-exploring random tree (RRT) method can be utilized for the forward integration pass, as long as the controlled drift terms are appropriately compensated in the backward integration pass. Subsequently, a numerical approximation of the value function is proposed by solving a series of function approximation problems backwards in time along the edges of the constructed RRT. Moreover, a local entropy-weighted least squares Monte Carlo (LSMC) method is developed to concentrate function approximation accuracy in regions most likely to be visited by optimally controlled trajectories. The proposed methodology is demonstrated on linear and nonlinear stochastic optimal control problems with non-quadratic running costs; the results reveal significant convergence improvements over previous FBSDE-based numerical solution methods.