OC · Apr 3, 2017
Data-Injection Attacks in Stochastic Control Systems: Detectability and Performance Tradeoffs
Cheng-Zong Bai, Fabio Pasqualetti, Vijay Gupta
Consider a stochastic process being controlled across a communication channel. The control signal that is transmitted across the control channel can be replaced by a malicious attacker. The controller is allowed to implement an arbitrary detection algorithm to detect if an attacker is present. This work characterizes some fundamental limitations on when such an attack can be detected, and quantifies the performance degradation that an attacker seeking to remain undetected (stealthy) can introduce.
SY · Apr 29, 2020
Distributed Synthesis of Local Controllers for Networked Systems with Arbitrary Interconnection Topologies
Etika Agarwal, S. Sivaranjani, Vijay Gupta et al.
We consider the problem of designing distributed controllers to guarantee dissipativity of a networked system comprised of dynamically coupled subsystems. We require that the control synthesis is carried out locally at the subsystem-level, without explicit knowledge of the dynamics of other subsystems in the network. We solve this problem in two steps. First, we provide distributed subsystem-level dissipativity analysis conditions whose feasibility is sufficient to guarantee dissipativity of the networked system. We then use these conditions to synthesize controllers locally at the subsystem-level, using only the knowledge of the dynamics of that subsystem, and limited information about the dissipativity of the subsystems to which it is dynamically coupled. We show that the subsystem-level controllers synthesized in this manner are sufficient to guarantee dissipativity of the networked dynamical system. We also provide an approach to make this synthesis compositional, that is, when a new subsystem is added to an existing network, only the dynamics of the new subsystem, and information about the dissipativity of the subsystems in the existing network to which it is coupled are used to design a controller for the new subsystem, while guaranteeing dissipativity of the networked system including the new subsystem. Finally, we demonstrate the application of this synthesis in enabling plug-and-play operations of generators in a microgrid by extending our results to networked switched systems.
SY · Feb 22, 2019
Sequential Synthesis of Distributed Controllers for Cascade Interconnected Systems
Etika Agarwal, S. Sivaranjani, Vijay Gupta et al.
We consider the problem of designing distributed controllers to ensure passivity of a large-scale interconnection of linear subsystems connected in a cascade topology. The control design process needs to be carried out at the subsystem-level with no direct knowledge of the dynamics of other subsystems in the interconnection. We present a distributed approach to solve this problem, where subsystem-level controllers are locally designed in a sequence starting at one end of the cascade using only the dynamics of the particular subsystem, coupling with the immediately preceding subsystem and limited information from the preceding subsystem in the cascade to ensure passivity of the interconnected system up to that point. We demonstrate that this design framework also allows for new subsystems to be compositionally added to the interconnection without requiring redesign of the pre-existing controllers.
SY · Dec 24, 2020
Distributed Mixed Voltage Angle and Frequency Droop Control of Microgrid Interconnections with Loss of Distribution-PMU Measurements
S Sivaranjani, Etika Agarwal, Vijay Gupta et al.
Recent advances in distribution-level phasor measurement unit (D-PMU) technology have enabled the use of voltage phase angle measurements for direct load sharing control in distribution-level microgrid interconnections with high penetration of renewable distributed energy resources (DERs). In particular, D-PMU enabled voltage angle droop control has the potential to enhance stability and transient performance in such microgrid interconnections. However, these angle droop control designs are vulnerable to D-PMU angle measurement losses that frequently occur due to the unavailability of a GPS signal for synchronization. In the event of such measurement losses, angle droop controlled microgrid interconnections may suffer from poor performance and potentially lose stability. In this paper, we propose a novel distributed mixed voltage angle and frequency droop control (D-MAFD) framework to improve the reliability of angle droop controlled microgrid interconnections. In this framework, when the D-PMU phase angle measurement is lost at a microgrid, conventional frequency droop control is temporarily used for primary control in place of angle droop control to guarantee stability. We model the microgrid interconnection with this primary control architecture as a nonlinear switched system and design distributed secondary controllers to guarantee transient stability of the network. Further, we incorporate performance specifications such as robustness to generation-load mismatch and network topology changes in the distributed control design. We demonstrate the performance of this control framework by simulation on a test 123-feeder distribution network.
SY · Aug 28, 2020
Mixed Voltage Angle and Frequency Droop Control for Transient Stability of Interconnected Microgrids with Loss of PMU Measurements
S Sivaranjani, Etika Agarwal, Le Xie et al.
We consider the problem of guaranteeing transient stability of a network of interconnected angle droop controlled microgrids, where voltage phase angle measurements from phasor measurement units (PMUs) may be lost, leading to poor performance and instability. In this paper, we propose a novel mixed voltage angle and frequency droop control (MAFD) framework to improve the reliability of such angle droop controlled microgrid interconnections. In this framework, when the phase angle measurement is lost at a microgrid, conventional frequency droop control is temporarily used for primary control in place of angle droop control. We model the network of interconnected microgrids with the MAFD architecture as a nonlinear switched system. We then propose a dissipativity-based distributed secondary control design to guarantee transient stability of this network under arbitrary switching between angle droop and frequency droop controllers. We demonstrate the performance of this control framework by simulation on a test 123-feeder distribution network.
SY · Mar 10, 2020
Weak Control for Human-in-the-loop Systems
Masaki Inoue, Vijay Gupta
In this letter, we propose a control framework for human-in-the-loop systems, in which many human decision makers are involved in the feedback loop composed of a plant and a controller. The novelty of the framework is that the decision makers are weakly controlled; in other words, they receive a set of admissible control actions from the controller and choose one of them in accordance with their private preferences. For example, the decision makers can decide their actions to minimize their own costs or by simply relying on their experience and intuition. A class of controllers which output set-valued signals is proposed, and it is shown that the overall control system is stable independently of the decisions made by the humans. Finally, a learning algorithm is applied to the controller that updates the controller parameters to reduce the achievable minimal costs for the decision makers. Effective use of the algorithm is demonstrated in a numerical experiment.
SY · Jan 5, 2015
On the trade-off between control performance and communication cost in event-triggered control
Burak Demirel, Vijay Gupta, Daniel E. Quevedo et al.
We consider a stochastic system where the communication between the controller and the actuator is triggered by a threshold-based rule. The communication is performed across an unreliable link that stochastically erases transmitted packets. To decrease the communication burden, and as a partial protection against dropped packets, the controller sends a sequence of control commands to the actuator in each packet. These commands are stored in a buffer and applied sequentially until the next control packet arrives. In this context, we study dead-beat control laws and compute the expected linear-quadratic loss of the closed-loop system for any given event-threshold. Furthermore, we provide analytical expressions that quantify the trade-off between the communication cost and the control performance of event-triggered control systems. Numerical examples demonstrate the effectiveness of the proposed framework.
SY · Jan 30, 2018
Conic-sector-based analysis and control synthesis for linear parameter varying systems
S Sivaranjani, James Richard Forbes, Peter Seiler et al.
We present a conic sector theorem for linear parameter varying (LPV) systems in which the traditional definition of conicity is violated for certain values of the parameter. We show that such LPV systems can be defined to be conic in an average sense if the parameter trajectories are restricted so that the system operates with such values of the parameter sufficiently rarely. We then show that such an average definition of conicity is useful in analyzing the stability of the system when it is connected in feedback with a conic system with appropriate conic properties. This can be regarded as an extension of the classical conic sector theorem. Based on this modified conic sector theorem, we design conic controllers that allow the closed-loop system to operate in nonconic parameter regions for brief periods of time. Due to this extra degree of freedom, these controllers lead to less conservative performance than traditional designs, in which the controller parameters are chosen based on the largest cone that the plant dynamics are contained in. We demonstrate the effectiveness of the proposed design in stabilizing a power grid with very high penetration of renewable energy while minimizing power transmission losses.
SY · Mar 13, 2015
A Switched Dynamical System Framework for Analysis of Massively Parallel Asynchronous Numerical Algorithms
Kooktae Lee, Raktim Bhattacharya, Vijay Gupta
In the near future, massively parallel computing systems will be necessary to solve computation-intensive applications. The key bottleneck in massively parallel implementation of numerical algorithms is the synchronization of data across processing elements (PEs) after each iteration, which results in significant idle time. Thus, there is a trend towards relaxing the synchronization and adopting an asynchronous model of computation to reduce idle time. However, it is not clear what effect this relaxation has on the stability and accuracy of the numerical algorithm. In this paper we present a new framework to analyze such algorithms. We treat the computation in each PE as a dynamical system and model the asynchrony as stochastic switching. The overall system is then analyzed as a switched dynamical system. However, modeling of massively parallel numerical algorithms as switched dynamical systems results in a very large number of modes, which makes current analysis tools available for such systems computationally intractable. We develop new techniques that circumvent this scalability issue. The framework is presented on a one-dimensional heat equation as a case study and the proposed analysis framework is verified by solving the partial differential equation (PDE) on an NVIDIA Tesla™ GPU machine, with asynchronous communication between cores.
SY · Nov 20, 2017
Dissipativity of system abstractions obtained using approximate input-output simulation
Etika Agarwal, Shravan Sajja, Panos J. Antsaklis et al.
This work focuses on the invariance of important properties between continuous and discrete models of systems which can be useful in the control design of large-scale systems and their software implementations. In particular, this paper discusses the relationships between the QSR dissipativity of a continuous state dynamical system and of its abstractions obtained through approximate input-output simulation relations. First, conditions to guarantee the dissipativity of the continuous system from its abstractions are provided. The reverse problem of determining the Q, S and R dissipativity matrices of the abstract system from that of the continuous system is also considered. Results characterizing the change in the dissipativity matrices are provided when the system abstraction is obtained. Since, under certain conditions, QSR dissipative systems are known to be stable, the results of this paper can be used to construct stable system abstractions as well. In the second part of this paper, we analyze the dissipativity of the approximate feedback composition of a continuous dynamical system and a discrete controller. We present illustrative examples to demonstrate the results of this paper.
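For readers unfamiliar with the Q, S and R matrices mentioned above, the following is the standard continuous-time textbook definition of QSR dissipativity, given here only for orientation. The symbols $u$, $y$, $x$, $V$ and the time interval are notation assumed for this note, and the paper's own definitions (in particular for the discrete abstraction) may differ in detail.

```latex
% A system with input u, output y and state x is QSR-dissipative if there
% exists a storage function V(x) >= 0 such that, for all t_1 >= t_0,
\[
  V\bigl(x(t_1)\bigr) - V\bigl(x(t_0)\bigr)
  \;\le\; \int_{t_0}^{t_1}
  \Bigl( y^{\top} Q\, y + 2\, y^{\top} S\, u + u^{\top} R\, u \Bigr)\, dt ,
  \qquad Q = Q^{\top},\; R = R^{\top}.
\]
% Special case: Q = 0, R = 0, S = (1/2) I recovers passivity.
```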
OC · Aug 22, 2019
Passivity-Based Analysis of Sampled and Quantized Control Implementations
Xiangru Xu, Necmiye Ozay, Vijay Gupta
This paper studies the performance of a continuous controller when implemented on digital devices via sampling and quantization, by leveraging passivity analysis. Degradation of passivity indices from a continuous-time control system to its sampled, input and output quantized model is studied using a notion of quasi-passivity. Based on that, the passivity property of a feedback-connected system where the continuous controller is replaced by its sampled and quantized model is studied, and conditions that ensure the state boundedness of the interconnected system are provided. Additionally, the approximate bisimulation-based control implementation, where the controller is replaced by its approximately bisimilar symbolic model whose states are also quantized, is analyzed. Several examples are provided to illustrate the theoretical results.
SY · Feb 5, 2018
Reliability and Market Price of Energy in the Presence of Intermittent and Non-Dispatchable Renewable Energies
Ashkan Zeinalzadeh, Donya Ghavidel, Vijay Gupta
The intermittent nature of renewable energies increases the operation costs of conventional generators. As the share of energy supplied by renewable sources increases, these costs also increase. In this paper, we quantify these costs by developing a market clearing price of energy in the presence of renewable energy and congestion constraints. We consider an electricity market where generators propose their asking price per unit of energy to an independent system operator (ISO). The ISO solves an optimization problem to dispatch energy from each generator to minimize the total cost of energy purchased on behalf of the consumers. To ensure that the generators are able to meet the load within a desired confidence level, we incorporate the notion of load variance using the Conditional Value-at-Risk (CVAR) measure in an electricity market and we derive the amount of committed power and market clearing price of energy as a function of CVAR. It is shown that a higher penetration of renewable energies may increase the committed power, market clearing price of energy and consumer cost of energy due to renewable generation uncertainties. We also obtain an upper bound on the amount by which congestion constraints can affect the committed power. We present descriptive simulations to illustrate the impact of renewable energy penetration and reliability levels on committed power by the non-renewable generators, difference between the dispatched and committed power, market price of energy and profit of renewable and non-renewable generators.
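As a rough illustration of the risk measure named in this abstract, the sketch below computes one simple empirical estimate of Conditional Value-at-Risk: the average of the worst `1 - beta` fraction of sampled losses. The function name `empirical_cvar`, the sample data, and the tail-averaging estimator are assumptions made for illustration; the paper derives committed power and price as functions of CVAR analytically and does not necessarily use this estimator.

```python
import math


def empirical_cvar(losses, beta):
    """Average of the worst (1 - beta) fraction of losses (hypothetical helper).

    `beta` is the confidence level in [0, 1); beta = 0 gives the plain mean.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if not losses:
        raise ValueError("losses must be non-empty")
    ordered = sorted(losses, reverse=True)  # largest losses first
    # Number of samples in the tail; always keep at least one.
    k = max(1, math.ceil(len(ordered) * (1.0 - beta)))
    tail = ordered[:k]
    return sum(tail) / k


if __name__ == "__main__":
    # Hypothetical generation shortfall samples (arbitrary units).
    shortfalls = [1, 2, 3, 4, 5, 6, 7, 8]
    print(empirical_cvar(shortfalls, 0.75))
```

Raising `beta` concentrates the estimate on rarer, larger shortfalls, which is the sense in which a stricter reliability level can increase committed power.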
OC · Oct 17, 2022
Learning Decentralized Linear Quadratic Regulators with $\sqrt{T}$ Regret
Lintao Ye, Ming Chi, Ruiquan Liao et al.
We propose an online learning algorithm that adaptively designs a decentralized linear quadratic regulator when the system model is unknown a priori and new data samples from a single system trajectory become progressively available. The algorithm uses a disturbance-feedback representation of state-feedback controllers coupled with online convex optimization with memory and delayed feedback. Under the assumption that the system is stable or given a known stabilizing controller, we show that our controller enjoys an expected regret that scales as $\sqrt{T}$ with the time horizon $T$ for the case of a partially nested information pattern. For more general information patterns, the optimal controller is unknown even if the system model is known. In this case, the regret of our controller is shown with respect to a linear sub-optimal controller. We validate our theoretical findings using numerical experiments.
ML · Feb 16, 2023
Intrinsic and extrinsic deep learning on manifolds
Yihao Fang, Ilsang Ohn, Vijay Gupta et al.
We propose extrinsic and intrinsic deep neural network architectures as general frameworks for deep learning on manifolds. Specifically, extrinsic deep neural networks (eDNNs) preserve geometric features on manifolds by utilizing an equivariant embedding from the manifold to its image in the Euclidean space. Moreover, intrinsic deep neural networks (iDNNs) incorporate the underlying intrinsic geometry of manifolds via exponential and log maps with respect to a Riemannian structure. Consequently, we prove that the empirical risk of the empirical risk minimizers (ERM) of eDNNs and iDNNs converges at optimal rates. Overall, the eDNNs framework is simple and easy to compute, while the iDNNs framework is accurate and fast converging. To demonstrate the utilities of our framework, various simulation studies and real data analyses are presented with eDNNs and iDNNs.
SY · Jul 25, 2022
Cooperative Actor-Critic via TD Error Aggregation
Martin Figura, Yixuan Lin, Ji Liu et al.
In decentralized cooperative multi-agent reinforcement learning, agents can aggregate information from one another to learn policies that maximize a team-average objective function. Despite the willingness to cooperate with others, the individual agents may find direct sharing of information about their local state, reward, and value function undesirable due to privacy issues. In this work, we introduce a decentralized actor-critic algorithm with TD error aggregation that does not compromise privacy and assumes that communication channels are subject to time delays and packet dropouts. The cost we pay for making such weak assumptions is an increased communication burden for every agent as measured by the dimension of the transmitted data. Interestingly, the communication burden is only quadratic in the graph size, which renders the algorithm applicable in large networks. We provide a convergence analysis under diminishing step size to verify that the agents maximize the team-average objective function.
34.6 · LG · May 11
Quotient-Categorical Representations for Bellman-Compatible Average-Reward Distributional Reinforcement Learning
Ege C. Kaya, Aliasghar Pourghani, Vijay Gupta et al.
Average-reward reinforcement learning requires estimating the gain and the bias, which is defined only up to an additive constant. This makes direct distributional analogues ill-posed on the real line. We introduce a quotient-space formulation in which state-indexed bias laws are identified up to a common translation, together with a categorical parameterization that respects this symmetry. On this quotient-categorical space, we define a projected average-reward distributional operator and show that it is well-defined, non-expansive in a coordinate Cramér metric, and admits fixed points. We then study sampled recursions whose mean-field maps are asynchronous relaxations of this operator. In an idealized centered-reward setting, a one-state temporal-difference update enjoys almost sure convergence together with finite-iteration residual bounds under both i.i.d. and Markovian sampling. When the gain is unknown, we augment the recursion with an online gain estimator, and prove non-expansiveness and Markovian convergence of the resulting coupled scheme. Finally, we show that synchronous exact updates are gain-independent at the quotient-law level, isolating a structural contrast between ideal quotient distributions and practical fixed-grid categorical representations.
11.6 · SY · Mar 18
Convergence of Payoff-Based Higher-Order Replicator Dynamics in Contractive Games
Hassan Abdelraouf, Vijay Gupta, Jeff S. Shamma
We study the convergence properties of a payoff-based higher-order version of replicator dynamics, a widely studied model in evolutionary dynamics and game-theoretic learning, in contractive games. Recent work has introduced a control-theoretic perspective for analyzing the convergence of learning dynamics through passivity theory, leading to a classification of learning dynamics based on the passivity notion they satisfy, such as δ-passivity, equilibrium-independent passivity, and incremental passivity. We leverage this framework for the study of higher-order replicator dynamics for contractive games, which form the complement of passive learning dynamics. Standard replicator dynamics can be represented as a cascade interconnection between an integrator and the softmax mapping. Payoff-based higher-order replicator dynamics include a linear time-invariant (LTI) system in parallel with the existing integrator. First, we show that if this added system is strictly passive and asymptotically stable, then the resulting learning dynamics converge locally to the Nash equilibrium in contractive games. Second, we establish global convergence properties using incremental stability analysis for the special case of symmetric matrix contractive games.
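The cascade representation mentioned in this abstract can be made concrete with a short sketch. It covers only the standard (first-order) replicator dynamics, not the higher-order variant the paper studies: integrating the payoff vector and passing the result through softmax gives a strategy whose time derivative matches the replicator right-hand side $x_i (p_i - x^\top p)$. The function names, the forward-Euler discretization, and the example payoff map $F(x) = -x$ (a simple strictly contractive game whose equilibrium is the uniform strategy) are assumptions chosen for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax mapping scores z to a mixed strategy."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def replicator_rhs(x, p):
    """Standard replicator vector field: x_i * (p_i - x . p)."""
    return x * (p - x @ p)


def simulate_cascade(payoff, z0, dt=0.01, steps=5000):
    """Forward-Euler simulation of the integrator + softmax cascade.

    z' = payoff(x),  x = softmax(z).  `payoff` is a hypothetical payoff map.
    """
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        x = softmax(z)
        z = z + dt * payoff(x)  # integrator driven by the payoff vector
    return softmax(z)


if __name__ == "__main__":
    # Illustrative strictly contractive game F(x) = -x on three strategies.
    x_final = simulate_cascade(lambda x: -x, [1.0, 0.0, -1.0])
    print(x_final)  # expected to be close to the uniform strategy
```

The equivalence follows from the softmax Jacobian, $\operatorname{diag}(x) - x x^\top$, applied to the payoff vector; the higher-order dynamics in the paper add an LTI system in parallel with the integrator, which this sketch omits.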
33.7 · PR · Apr 3
The Variational Approach in Filtering and Correlated Noise
Sharan Srinivasan, Vijay Gupta, Harsha Honnappa
The variational formulation of nonlinear filtering due to Mitter and Newton characterizes the filtering distribution as the unique minimizer of a free energy functional involving the relative entropy with respect to the prior and an expected energy. This formulation rests on an absolute continuity condition between the joint path measure and a product reference measure. We prove that this condition necessarily fails whenever the signal and observation diffusions share a common noise source. Specifically, we show that the joint and product measures are mutually singular, so no choice of reference measure can salvage the formulation. We then introduce a conditional variational principle that replaces the prior with a reference measure that preserves the noise correlation structure. This generalization recovers the Mitter--Newton formulation as a special case when the noises are independent, and yields an explicit free energy characterization of the filter in the linear correlated-noise setting.
LG · Feb 16
Coverage Guarantees for Pseudo-Calibrated Conformal Prediction under Distribution Shift
Farbod Siahkali, Ashwin Verma, Vijay Gupta
Conformal prediction (CP) offers distribution-free marginal coverage guarantees under an exchangeability assumption, but these guarantees can fail if the data distribution shifts. We analyze the use of pseudo-calibration as a tool to counter this performance loss under a bounded label-conditional covariate shift model. Using tools from domain adaptation, we derive a lower bound on target coverage in terms of the source-domain loss of the classifier and a Wasserstein measure of the shift. Using this result, we provide a method to design pseudo-calibrated sets that inflate the conformal threshold by a slack parameter to keep target coverage above a prescribed level. Finally, we propose a source-tuned pseudo-calibration algorithm that interpolates between hard pseudo-labels and randomized labels as a function of classifier uncertainty. Numerical experiments show that our bounds qualitatively track pseudo-calibration behavior and that the source-tuned scheme mitigates coverage degradation under distribution shift while maintaining nontrivial prediction set sizes.
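To make the idea of inflating the conformal threshold concrete, here is a minimal sketch of standard split conformal prediction with an added slack term. The finite-sample quantile rule (the $\lceil (n+1)(1-\alpha) \rceil$-th smallest calibration score) is the usual split-conformal construction; the additive form of the slack, the function names, and the example scores are assumptions for illustration, and the paper's pseudo-calibration procedure and its choice of slack are not reproduced here.

```python
import math


def conformal_threshold(cal_scores, alpha, slack=0.0):
    """Split-conformal score threshold, inflated by an additive slack.

    cal_scores: nonconformity scores on a calibration set (lower = better fit).
    alpha: target miscoverage level, so nominal coverage is 1 - alpha.
    slack: hypothetical additive inflation to hedge against distribution shift.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1.0 - alpha))  # finite-sample corrected rank
    if k > n:
        return math.inf  # too few calibration points: include every label
    return sorted(cal_scores)[k - 1] + slack


def prediction_set(label_scores, threshold):
    """Labels whose nonconformity score does not exceed the threshold."""
    return [label for label, s in label_scores.items() if s <= threshold]


if __name__ == "__main__":
    cal = [5, 1, 9, 3, 7, 2, 10, 4, 8, 6]  # hypothetical calibration scores
    t = conformal_threshold(cal, alpha=0.25, slack=0.5)
    print(t, prediction_set({"cat": 4.0, "dog": 9.2, "fox": 11.0}, t))
```

A larger slack trades bigger prediction sets for more protection against coverage loss on the shifted target distribution.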
LG · Mar 6, 2024
Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems
Wesley A. Suttle, Vipul K. Sharma, Krishna C. Kosaraju et al.
We develop provably safe and convergent reinforcement learning (RL) algorithms for control of nonlinear dynamical systems, bridging the gap between the hard safety guarantees of control theory and the convergence guarantees of RL theory. Recent advances at the intersection of control and RL follow a two-stage, safety filter approach to enforcing hard safety constraints: model-free RL is used to learn a potentially unsafe controller, whose actions are projected onto safe sets prescribed, for example, by a control barrier function. Though safe, such approaches lose any convergence guarantees enjoyed by the underlying RL methods. In this paper, we develop a single-stage, sampling-based approach to hard constraint satisfaction that learns RL controllers enjoying classical convergence guarantees while satisfying hard safety constraints throughout training and deployment. We validate the efficacy of our approach in simulation, including safe control of a quadcopter in a challenging obstacle avoidance problem, and demonstrate that it outperforms existing benchmarks.
LG · Aug 8, 2025
Parameter-free Optimal Rates for Nonlinear Semi-Norm Contractions with Applications to $Q$-Learning
Ankur Naskar, Gugan Thoppe, Vijay Gupta
Algorithms for solving nonlinear fixed-point equations, such as average-reward $Q$-learning and TD-learning, often involve semi-norm contractions. Achieving parameter-free optimal convergence rates for these methods via Polyak--Ruppert averaging has remained elusive, largely due to the non-monotonicity of such semi-norms. We close this gap by (i) recasting the averaged error as a linear recursion involving a nonlinear perturbation, and (ii) taming the nonlinearity by coupling the semi-norm's contraction with the monotonicity of a suitably induced norm. Our main result yields the first parameter-free $\tilde{O}(1/\sqrt{t})$ optimal rates for $Q$-learning in both average-reward and exponentially discounted settings, where $t$ denotes the iteration index. The result applies within a broad framework that accommodates synchronous and asynchronous updates, single-agent and distributed deployments, and data streams obtained either from simulators or along Markovian trajectories.
OC · Jan 2, 2024
Model-Free Learning for the Linear Quadratic Regulator over Rate-Limited Channels
Lintao Ye, Aritra Mitra, Vijay Gupta
Consider a linear quadratic regulator (LQR) problem being solved in a model-free manner using the policy gradient approach. If the gradient of the quadratic cost is being transmitted across a rate-limited channel, both the convergence and the rate of convergence of the resulting controller may be affected by the bit-rate permitted by the channel. We first pose this problem in a communication-constrained optimization framework and propose a new adaptive quantization algorithm titled Adaptively Quantized Gradient Descent (AQGD). This algorithm guarantees exponentially fast convergence to the globally optimal policy, with no deterioration of the exponent relative to the unquantized setting, above a certain finite threshold bit-rate allowed by the communication channel. We then propose a variant of AQGD that provides similar performance guarantees when applied to solve the model-free LQR problem. Our approach reveals the benefits of adaptive quantization in preserving fast linear convergence rates, and, as such, may be of independent interest to the literature on compressed optimization. Our work also marks a first step towards a more general bridge between the fields of model-free control design and networked control systems.
LG · Nov 21, 2025
Harnessing Data from Clustered LQR Systems: Personalized and Collaborative Policy Optimization
Vinay Kanakeri, Shivam Bajaj, Ashwin Verma et al.
It is known that reinforcement learning (RL) is data-hungry. To improve sample-efficiency of RL, it has been proposed that the learning algorithm utilize data from 'approximately similar' processes. However, since the process models are unknown, identifying which other processes are similar poses a challenge. In this work, we study this problem in the context of the benchmark Linear Quadratic Regulator (LQR) setting. Specifically, we consider a setting with multiple agents, each corresponding to a copy of a linear process to be controlled. The agents' local processes can be partitioned into clusters based on similarities in dynamics and tasks. Combining ideas from sequential elimination and zeroth-order policy optimization, we propose a new algorithm that performs simultaneous clustering and learning to output a personalized policy (controller) for each cluster. Under a suitable notion of cluster separation that captures differences in closed-loop performance across systems, we prove that our approach guarantees correct clustering with high probability. Furthermore, we show that the sub-optimality gap of the policy learned for each cluster scales inversely with the size of the cluster, with no additional bias, unlike in prior works on collaborative learning-based control. Our work is the first to reveal how clustering can be used in data-driven control to learn personalized policies that enjoy statistical gains from collaboration but do not suffer sub-optimality due to inclusion of data from dissimilar processes. From a distributed implementation perspective, our method is attractive as it incurs only a mild logarithmic communication overhead.
LG · Oct 8, 2025
Parameter-Free Federated TD Learning with Markov Noise in Heterogeneous Environments
Ankur Naskar, Gugan Thoppe, Utsav Negi et al.
Federated learning (FL) can dramatically speed up reinforcement learning by distributing exploration and training across multiple agents. It can guarantee an optimal convergence rate that scales linearly in the number of agents, i.e., a rate of $\tilde{O}(1/(NT))$, where $T$ is the iteration index and $N$ is the number of agents. However, when the training samples arise from a Markov chain, existing results on TD learning achieving this rate require the algorithm to depend on unknown problem parameters. We close this gap by proposing a two-timescale Federated Temporal Difference (FTD) learning algorithm with Polyak-Ruppert averaging. Our method provably attains the optimal $\tilde{O}(1/(NT))$ rate in both average-reward and discounted settings, offering a parameter-free FTD approach for Markovian data. Although our results are novel even in the single-agent setting, they apply to the more realistic and challenging scenario of FL with heterogeneous environments.
SY · Sep 28, 2025
Communication-aware Wide-Area Damping Control using Risk-Constrained Reinforcement Learning
Kyung-bin Kwon, Lintao Ye, Vijay Gupta et al.
Non-ideal communication links, especially delays, critically affect fast networked controls in power systems, such as the wide-area damping control (WADC). Traditionally, a delay estimation and compensation approach is adopted to address this cyber-physical coupling, but it demands very high accuracy for the fast WADC and cannot handle other cyber concerns like link failures or cyber perturbations. Hence, we propose a new risk-constrained framework that can target the communication delays, yet is amenable to general uncertainty under the cyber-physical couplings. Our WADC model includes the synchronous generators (SGs), and also voltage source converters (VSCs) for additional damping capabilities. To mitigate uncertainty, a mean-variance risk constraint is introduced to the classical optimal control cost of the linear quadratic regulator (LQR). Unlike estimating delays, our approach can effectively mitigate large communication delays by improving the worst-case performance. A reinforcement learning (RL)-based algorithm, namely, stochastic gradient-descent with max-oracle (SGDmax), is developed to solve the risk-constrained problem. We further show its guaranteed convergence to stationarity with high probability, even using the simple zero-order policy gradient (ZOPG). Numerical tests on the IEEE 68-bus system not only verify SGDmax's convergence and VSCs' damping capabilities, but also demonstrate that our approach outperforms conventional delay compensator-based methods under estimation error. While focusing on performance improvement under large delays, our proposed risk-constrained design can effectively mitigate the worst-case oscillations, making it equally effective for addressing other communication issues and cyber perturbations.
SYFeb 7, 2025
End-to-End Learning Framework for Solving Non-Markovian Optimal ControlXiaole Zhang, Peiyu Zhang, Xiongye Xiao et al.
Integer-order calculus often falls short in capturing the long-range dependencies and memory effects found in many real-world processes. Fractional calculus addresses these gaps via fractional-order integrals and derivatives, but fractional-order dynamical systems pose substantial challenges in system identification and optimal control due to the lack of standard control methodologies. In this paper, we theoretically derive the optimal control via linear quadratic regulator (LQR) for fractional-order linear time-invariant (FOLTI) systems and develop an end-to-end deep learning framework based on this theoretical foundation. Our approach establishes a rigorous mathematical model, derives analytical solutions, and incorporates deep learning to achieve data-driven optimal control of FOLTI systems. Our key contributions include: (i) proposing an innovative system identification method and control strategy for FOLTI systems, (ii) developing the first end-to-end data-driven learning framework, Fractional-Order Learning for Optimal Control (FOLOC), that learns control policies from observed trajectories, and (iii) deriving a theoretical analysis of sample complexity to quantify the number of samples required for accurate optimal control in complex real-world problems. Experimental results indicate that our method accurately approximates fractional-order system behaviors without relying on Gaussian noise assumptions, pointing to promising avenues for advanced optimal control.
LGNov 25, 2021
Robustness against Adversarial Attacks in Neural Networks using Incremental DissipativityBernardo Aquino, Arash Rahnama, Peter Seiler et al.
Adversarial examples can easily degrade the classification performance of neural networks. Empirical methods for promoting robustness to such examples have been proposed, but often lack both analytical insights and formal guarantees. Recently, some robustness certificates have appeared in the literature based on system-theoretic notions. This work proposes an incremental dissipativity-based robustness certificate for neural networks in the form of a linear matrix inequality for each layer. We also propose an equivalent spectral norm bound for this certificate which is scalable to neural networks with multiple layers. We demonstrate the improved performance against adversarial attacks on a feed-forward neural network trained on MNIST and an AlexNet trained using CIFAR-10.
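To see why per-layer spectral norms matter for robustness, recall a standard fact: for a feed-forward network whose activations are 1-Lipschitz (ReLU, for example), the product of the layers' spectral norms upper-bounds the network's Lipschitz constant. The sketch below computes that product. It is not the paper's dissipativity certificate, and `lipschitz_upper_bound` is a hypothetical name.

```python
import numpy as np

def lipschitz_upper_bound(weights):
    """Product of spectral norms of the weight matrices.

    Assumes a plain feed-forward network with 1-Lipschitz activations,
    so an input perturbation of norm eps changes the output by at most
    bound * eps in the 2-norm.
    """
    bound = 1.0
    for W in weights:
        # ord=2 on a 2-D array returns the largest singular value.
        bound *= np.linalg.norm(np.asarray(W, dtype=float), 2)
    return bound
```

Keeping each layer's spectral norm small during training therefore limits how far an adversarial perturbation can move the output.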
LGNov 24, 2021
Finite-Time Error Bounds for Distributed Linear Stochastic ApproximationYixuan Lin, Vijay Gupta, Ji Liu
This paper considers a novel multi-agent linear stochastic approximation algorithm driven by Markovian noise and general consensus-type interaction, in which each agent evolves according to its local stochastic approximation process which depends on the information from its neighbors. The interconnection structure among the agents is described by a time-varying directed graph. While the convergence of consensus-based stochastic approximation algorithms when the interconnection among the agents is described by doubly stochastic matrices (at least in expectation) has been studied, less is known about the case when the interconnection matrix is simply stochastic. For any uniformly strongly connected graph sequence whose associated interaction matrices are stochastic, the paper derives finite-time bounds on the mean-square error, defined as the deviation of the output of the algorithm from the unique equilibrium point of the associated ordinary differential equation. For the case of interconnection matrices being stochastic, the equilibrium point can be any unspecified convex combination of the local equilibria of all the agents in the absence of communication. Both constant and time-varying step-sizes are considered. In the case when the convex combination is required to be a straight average and interaction between any pair of neighboring agents may be uni-directional, so that doubly stochastic matrices cannot be implemented in a distributed manner, the paper proposes a push-sum-type distributed stochastic approximation algorithm and provides its finite-time bound for the time-varying step-size case by leveraging the analysis for the consensus-type algorithm with stochastic matrices and developing novel properties of the push-sum algorithm. Distributed temporal difference learning is discussed as an illustrative application.
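The role of push-sum can be illustrated with its basic averaging form: on a directed graph where only column-stochastic weights are available, each agent tracks a value and a weight, and the ratio of the two converges to the straight average. The sketch below assumes a fixed, strongly connected graph with self-loops and omits the stochastic approximation step; `push_sum` is a hypothetical name.

```python
import numpy as np

def push_sum(values, A, n_iters=200):
    """Plain push-sum averaging.

    A[i, j] is the share of agent j's mass sent to agent i, so every
    column of A sums to one (column-stochastic). A need not be doubly
    stochastic. Returns each agent's ratio estimate x_i / y_i.
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(values, dtype=float).copy()  # running values
    y = np.ones_like(x)                         # running weights
    for _ in range(n_iters):
        x = A @ x
        y = A @ y
    return x / y

# Directed 3-node example: columns sum to one, rows do not.
A_example = np.array([
    [1 / 3, 0.0, 0.5],
    [1 / 3, 0.5, 0.0],
    [1 / 3, 0.5, 0.5],
])
```

Calling `push_sum([1.0, 2.0, 6.0], A_example)` should drive every entry toward the average of the initial values, even though `A_example` is not doubly stochastic.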
LGNov 12, 2021
Resilient Consensus-based Multi-agent Reinforcement Learning with Function ApproximationMartin Figura, Yixuan Lin, Ji Liu et al.
Adversarial attacks during training can strongly influence the performance of multi-agent reinforcement learning algorithms. It is, thus, highly desirable to augment existing algorithms such that the impact of adversarial attacks on cooperative networks is eliminated, or at least bounded. In this work, we consider a fully decentralized network, where each agent receives a local reward and observes the global state and action. We propose a resilient consensus-based actor-critic algorithm, whereby each agent estimates the team-average reward and value function, and communicates the associated parameter vectors to its immediate neighbors. We show that in the presence of Byzantine agents, whose estimation and communication strategies are completely arbitrary, the estimates of the cooperative agents converge to a bounded consensus value with probability one, provided that there are at most $H$ Byzantine agents in the neighborhood of each cooperative agent and the network is $(2H+1)$-robust. Furthermore, we prove that the policy of the cooperative agents converges with probability one to a bounded neighborhood around a local maximizer of their team-average objective function under the assumption that the policies of the adversarial agents asymptotically become stationary.
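The abstract does not state the aggregation rule used by the cooperative agents. One common choice in the resilient-consensus literature, shown here purely as an assumed illustration, is a trimmed mean: each agent discards the $H$ largest and $H$ smallest values received from its neighbors before averaging. The function name `trimmed_mean_update` is hypothetical, and the sketch handles a single scalar parameter.

```python
def trimmed_mean_update(own_value, neighbor_values, H):
    """Average own value with neighbors' values after trimming extremes.

    Drops the H smallest and H largest neighbor values, so up to H
    arbitrarily corrupted neighbors cannot pull the result outside the
    range spanned by the remaining values.
    """
    values = sorted(neighbor_values)
    if len(values) <= 2 * H:
        raise ValueError("need more than 2*H neighbor values to trim")
    kept = values[H:len(values) - H]
    kept.append(own_value)
    return sum(kept) / len(kept)
```

With `H = 1`, a single neighbor reporting a wildly wrong value is discarded before the average is taken.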
OCOct 14, 2021
On the Sample Complexity of Decentralized Linear Quadratic Regulator with Partially Nested Information StructureLintao Ye, Hao Zhu, Vijay Gupta
We study the problem of control policy design for decentralized state-feedback linear quadratic control with a partially nested information structure, when the system model is unknown. We propose a model-based learning solution, which consists of two steps. First, we estimate the unknown system model from a single system trajectory of finite length, using least squares estimation. Next, based on the estimated system model, we design a control policy that satisfies the desired information structure. We show that the suboptimality gap between our control policy and the optimal decentralized control policy (designed using accurate knowledge of the system model) scales linearly with the estimation error of the system model. Using this result, we provide an end-to-end sample complexity result for learning decentralized controllers for a linear quadratic control problem with a partially nested information structure.
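The first step, least-squares estimation from a single trajectory, can be sketched as follows, assuming linear dynamics $x_{t+1} = A x_t + B u_t + w_t$ with recorded states and inputs. The function name `estimate_ab` and the array layout are assumptions made for illustration; the decentralized control design step is not shown.

```python
import numpy as np

def estimate_ab(xs, us):
    """Least-squares estimate of (A, B) from one trajectory.

    xs: (T + 1, n) array of states x_0, ..., x_T.
    us: (T, m) array of inputs u_0, ..., u_{T-1}.
    Solves min over [A B] of sum_t ||x_{t+1} - A x_t - B u_t||^2.
    """
    xs = np.asarray(xs, dtype=float)
    us = np.asarray(us, dtype=float)
    n = xs.shape[1]
    Z = np.hstack([xs[:-1], us])   # regressors [x_t, u_t], shape (T, n + m)
    Y = xs[1:]                     # targets x_{t+1}, shape (T, n)
    theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    AB = theta.T                   # shape (n, n + m)
    return AB[:, :n], AB[:, n:]
```

With noiseless data and sufficiently exciting inputs the estimate is exact; with process noise, the estimation error is the quantity that the abstract's suboptimality bound scales with.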
LGMar 12, 2021
EventGraD: Event-Triggered Communication in Parallel Machine LearningSoumyadip Ghosh, Bernardo Aquino, Vijay Gupta
Communication in parallel systems imposes significant overhead which often turns out to be a bottleneck in parallel machine learning. To relieve some of this overhead, in this paper, we present EventGraD, an algorithm with event-triggered communication for stochastic gradient descent in parallel machine learning. The main idea of this algorithm is to relax the requirement of communicating at every iteration, as in standard implementations of stochastic gradient descent in parallel machine learning, to communicating only when necessary at certain iterations. We provide a theoretical analysis of the convergence of our proposed algorithm. We also implement the proposed algorithm for data-parallel training of a popular residual neural network on the CIFAR-10 dataset and show that EventGraD can reduce the communication load by up to 60% while retaining the same level of accuracy. In addition, EventGraD can be combined with other approaches such as Top-K sparsification to decrease communication further while maintaining accuracy.
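A minimal sketch of event-triggered communication, assuming a simple rule that the abstract does not specify: a worker transmits its parameters only when they have moved by more than a fixed threshold (in the 2-norm) since the last transmission. The class `EventTriggeredSender` and its fixed threshold are hypothetical and are not EventGraD's actual trigger.

```python
import numpy as np

class EventTriggeredSender:
    """Decides when a worker should send its parameters to neighbors."""

    def __init__(self, initial, threshold):
        self.last_sent = np.array(initial, dtype=float)  # last transmitted copy
        self.threshold = float(threshold)
        self.sent_count = 0

    def maybe_send(self, params):
        """Return True (and record the send) if the change is large enough."""
        params = np.asarray(params, dtype=float)
        if np.linalg.norm(params - self.last_sent) > self.threshold:
            self.last_sent = params.copy()
            self.sent_count += 1
            return True
        return False
```

Receivers keep using the last transmitted copy between events, so a larger threshold trades staleness for fewer messages.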
SYMar 11, 2021
Adversarial attacks in consensus-based multi-agent reinforcement learningMartin Figura, Krishna Chaitanya Kosaraju, Vijay Gupta
Recently, many cooperative distributed multi-agent reinforcement learning (MARL) algorithms have been proposed in the literature. In this work, we study the effect of adversarial attacks on a network that employs a consensus-based MARL algorithm. We show that an adversarial agent can persuade all the other agents in the network to implement policies that optimize an objective that it desires. In this sense, the standard consensus-based MARL algorithms are fragile to attacks.
NAOct 16, 2018
Better numerical approximation by Durrmeyer type operatorsAna Maria Acu, Vijay Gupta, Gancho Tachev
The main object of this paper is to construct new Durrmeyer type operators which have better features than the classical one. Some results concerning the rate of convergence and asymptotic formulas of the new operator are given. Finally, the theoretical results are analyzed by numerical examples.
MLAug 13, 2017
Encoding Multi-Resolution Brain Networks Using Unsupervised Deep LearningArash Rahnama, Abdullah Alchihabi, Vijay Gupta et al.
The main goal of this study is to extract a set of brain networks in multiple time-resolutions to analyze the connectivity patterns among the anatomic regions for a given cognitive task. We suggest a deep architecture which learns the natural groupings of the connectivity patterns of the human brain in multiple time-resolutions. The suggested architecture is tested on the task data set of the Human Connectome Project (HCP), where we extract multi-resolution networks, each of which corresponds to a cognitive task. At the first level of this architecture, we decompose the fMRI signal into multiple sub-bands using wavelet decompositions. At the second level, for each sub-band, we estimate a brain network extracted from short time windows of the fMRI signal. At the third level, we feed the adjacency matrices of each mesh network at each time-resolution into an unsupervised deep learning algorithm, namely, a Stacked Denoising Auto-Encoder (SDAE). The outputs of the SDAE provide a compact connectivity representation for each time window at each sub-band of the fMRI signal. We concatenate the learned representations of all sub-bands at each window and cluster them by a hierarchical algorithm to find the natural groupings among the windows. We observe that each cluster represents a cognitive task with a performance of 93% Rand Index and 71% Adjusted Rand Index. We visualize the mean values and the precisions of the networks at each component of the cluster mixture. The mean brain networks at cluster centers show the variations among cognitive tasks and the precision of each cluster shows the within-cluster variability of networks across the subjects.
SYAug 9, 2017
Trade-Offs in Stochastic Event-Triggered ControlBurak Demirel, Alex S. Leong, Vijay Gupta et al.
This paper studies the optimal output-feedback control of a linear time-invariant system where a stochastic event-based scheduler triggers the communication between the sensor and the controller. The primary goal of this type of scheduling strategy is to significantly reduce the usage of the sensor-to-controller communication and, in turn, reduce energy expenditure in the network. In this paper, we aim to design an admissible control policy, which is a function of the observed output, to minimize a quadratic cost function while employing a stochastic event-triggered scheduler that preserves the Gaussian property of the plant state and the estimation error. For the infinite horizon case, we present analytical expressions that quantify the trade-off between the communication cost and control performance of such event-triggered control systems. This trade-off is confirmed quantitatively via numerical examples.
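The abstract does not give the trigger law. One form used in the stochastic event-triggering literature, assumed here for illustration, transmits a measurement $y$ with probability $1 - \exp(-\tfrac{1}{2} y^\top Y y)$ for a positive definite design matrix $Y$; the Gaussian-shaped hold probability is what keeps the conditional distributions Gaussian. The function names below are hypothetical.

```python
import numpy as np

def transmit_probability(y, Y):
    """Probability of transmitting y under the assumed trigger law."""
    y = np.asarray(y, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return 1.0 - float(np.exp(-0.5 * y @ Y @ y))

def should_transmit(y, Y, rng):
    """Draw the random transmit decision for one measurement."""
    return bool(rng.random() < transmit_probability(y, Y))
```

Scaling `Y` up makes transmissions more frequent for the same measurement, which is the knob that trades communication cost against control performance.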
SYJun 5, 2017
Provably Safe Cruise Control of Vehicular PlatoonsSadra Sadraddini, Sivaranjani S, Vijay Gupta et al.
We synthesize performance-aware safe cruise control policies for longitudinal motion of platoons of autonomous vehicles. Using set-invariance theories, we guarantee infinite-time collision avoidance in the presence of bounded additive disturbances, while ensuring that the length and the cruise speed of the platoon are bounded within specified ranges. We propose (i) a centralized control policy, and (ii) a distributed control policy, where each vehicle's control decision depends solely on its relative kinematics with respect to the platoon leader. Numerical examples are included.
OCSep 8, 2015
Passivity Degradation In Discrete Control Implementations: An Approximate Bisimulation ApproachXiangru Xu, Necmiye Ozay, Vijay Gupta
In this paper, we present some preliminary results for compositional analysis of heterogeneous systems containing both discrete state models and continuous systems using consistent notions of dissipativity and passivity. We study the following problem: given a physical plant model and a continuous feedback controller designed using traditional control techniques, how is the closed-loop passivity affected when the continuous controller is replaced by a discrete (i.e., symbolic) implementation within this framework? Specifically, we give quantitative results on performance degradation when the discrete control implementation is approximately bisimilar to the continuous controller, and based on them, we provide conditions that guarantee the boundedness property of the closed-loop system.