LGJul 9, 2024
A Complete Set of Quadratic Constraints for Repeated ReLU and GeneralizationsSahel Vahedi Noori, Bin Hu, Geir Dullerud et al.
This paper derives a complete set of quadratic constraints (QCs) for the repeated ReLU. The complete set of QCs is described by a collection of matrix copositivity conditions. We also show that only two functions satisfy all QCs in our complete set: the repeated ReLU and flipped ReLU. Thus our complete set of QCs bounds the repeated ReLU as tight as possible up to the sign invariance inherent in quadratic forms. We derive a similar complete set of incremental QCs for repeated ReLU, which can potentially lead to less conservative Lipschitz bounds for ReLU networks than the standard LipSDP approach. The basic constructions are also used to derive the complete sets of QCs for other piecewise linear activation functions such as leaky ReLU, MaxMin, and HouseHolder. Finally, we illustrate the use of the complete set of QCs to assess stability and performance for recurrent neural networks with ReLU activation functions. We rely on a standard copositivity relaxation to formulate the stability/performance condition as a semidefinite program. Simple examples are provided to illustrate that the complete sets of QCs and incremental QCs can yield less conservative bounds than existing sets.
90.9SYMar 17
Integral Quadratic Constraints for Repeated ReLUSahel Vahedi Noori, Bin Hu, Geir Dullerud et al.
This paper presents a new dynamic integral quadratic constraint (IQC) for the repeated Rectified Linear Unit (ReLU). These dynamic IQCs can be used to analyze stability and induced $\ell_2$-gain performance of discrete-time, recurrent neural networks (RNNs) with ReLU activation functions. These analysis conditions can be incorporated into learning-based controller synthesis methods, which currently rely on static IQCs. We show that our proposed dynamic IQCs for repeated ReLU form a superset of the dynamic IQCs for repeated, slope-restricted nonlinearities. We also prove that the $\ell_2$-gain bounds are nonincreasing with respect to the horizon used in the dynamic IQC filter. A numerical example using a simple (academic) RNN shows that our proposed IQCs lead to less conservative bounds than existing IQCs.
OCApr 4, 2024
Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 UltraDarioush Kevian, Usman Syed, Xingang Guo et al.
In this paper, we explore the capabilities of state-of-the-art large language models (LLMs) such as GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra in solving undergraduate-level control problems. Controls provides an interesting case study for LLM reasoning due to its combination of mathematical theory and engineering design. We introduce ControlBench, a benchmark dataset tailored to reflect the breadth, depth, and complexity of classical control design. We use this dataset to study and evaluate the problem-solving abilities of these LLMs in the context of control engineering. We present evaluations conducted by a panel of human experts, providing insights into the accuracy, reasoning, and explanatory prowess of LLMs in control engineering. Our analysis reveals the strengths and limitations of each LLM in the context of classical control, and our results imply that Claude 3 Opus has become the state-of-the-art LLM for solving undergraduate control problems. Our study serves as an initial step towards the broader goal of employing artificial general intelligence in control engineering.
SYOct 17, 2024
ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain ExpertiseXingang Guo, Darioush Keivan, Usman Syed et al.
Control system design is a crucial aspect of modern engineering with far-reaching applications across diverse sectors including aerospace, automotive systems, power grids, and robotics. Despite advances made by Large Language Models (LLMs) in various domains, their application in control system design remains limited due to the complexity and specificity of control theory. To bridge this gap, we introduce ControlAgent, a new paradigm that automates control system design via novel integration of LLM agents and control-oriented domain expertise. ControlAgent encodes expert control knowledge and emulates human iterative design processes by gradually tuning controller parameters to meet user-specified requirements for stability, performance, and robustness. ControlAgent integrates multiple collaborative LLM agents, including a central agent responsible for task distribution and task-specific agents dedicated to detailed controller design for various types of systems and requirements. ControlAgent also employs a Python computation agent that performs complex calculations and controller evaluations based on standard design information provided by task-specified LLM agents. Combined with a history and feedback module, the task-specific LLM agents iteratively refine controller parameters based on real-time feedback from prior designs. Overall, ControlAgent mimics the design processes used by (human) practicing engineers, but removes all the human efforts and can be run in a fully automated way to give end-to-end solutions for control system design with user-specified requirements. To validate ControlAgent's effectiveness, we develop ControlEval, an evaluation dataset that comprises 500 control tasks with various specific design goals. The effectiveness of ControlAgent is demonstrated via extensive comparative evaluations between LLM-based and traditional human-involved toolbox-based baselines.
SYMay 8, 2024
Stability and Performance Analysis of Discrete-Time ReLU Recurrent Neural NetworksSahel Vahedi Noori, Bin Hu, Geir Dullerud et al.
This paper presents sufficient conditions for the stability and $\ell_2$-gain performance of recurrent neural networks (RNNs) with ReLU activation functions. These conditions are derived by combining Lyapunov/dissipativity theory with Quadratic Constraints (QCs) satisfied by repeated ReLUs. We write a general class of QCs for repeated RELUs using known properties for the scalar ReLU. Our stability and performance condition uses these QCs along with a "lifted" representation for the ReLU RNN. We show that the positive homogeneity property satisfied by a scalar ReLU does not expand the class of QCs for the repeated ReLU. We present examples to demonstrate the stability / performance condition and study the effect of the lifting horizon.
OCFeb 18, 2024
Model-Free $μ$-Synthesis: A Nonsmooth Optimization PerspectiveDarioush Keivan, Xingang Guo, Peter Seiler et al.
In this paper, we revisit model-free policy search on an important robust control benchmark, namely $μ$-synthesis. In the general output-feedback setting, there do not exist convex formulations for this problem, and hence global optimality guarantees are not expected. Apkarian (2011) presented a nonconvex nonsmooth policy optimization approach for this problem, and achieved state-of-the-art design results via using subgradient-based policy search algorithms which generate update directions in a model-based manner. Despite the lack of convexity and global optimality guarantees, these subgradient-based policy search methods have led to impressive numerical results in practice. Built upon such a policy optimization persepctive, our paper extends these subgradient-based search methods to a model-free setting. Specifically, we examine the effectiveness of two model-free policy optimization strategies: the model-free non-derivative sampling method and the zeroth-order policy search with uniform smoothing. We performed an extensive numerical study to demonstrate that both methods consistently replicate the design outcomes achieved by their model-based counterparts. Additionally, we provide some theoretical justifications showing that convergence guarantees to stationary points can be established for our model-free $μ$-synthesis under some assumptions related to the coerciveness of the cost function. Overall, our results demonstrate that derivative-free policy optimization offers a competitive and viable approach for solving general output-feedback $μ$-synthesis problems in the model-free setting.
OCJan 3, 2022
Revisiting PGD Attacks for Stability Analysis of Large-Scale Nonlinear Systems and Perception-Based ControlAaron Havens, Darioush Keivan, Peter Seiler et al.
Many existing region-of-attraction (ROA) analysis tools find difficulty in addressing feedback systems with large-scale neural network (NN) policies and/or high-dimensional sensing modalities such as cameras. In this paper, we tailor the projected gradient descent (PGD) attack method developed in the adversarial learning community as a general-purpose ROA analysis tool for large-scale nonlinear systems and end-to-end perception-based control. We show that the ROA analysis can be approximated as a constrained maximization problem whose goal is to find the worst-case initial condition which shifts the terminal state the most. Then we present two PGD-based iterative methods which can be used to solve the resultant constrained maximization problem. Our analysis is not based on Lyapunov theory, and hence requires minimum information of the problem structures. In the model-based setting, we show that the PGD updates can be efficiently performed using back-propagation. In the model-free setting (which is more relevant to ROA analysis of perception-based control), we propose a finite-difference PGD estimate which is general and only requires a black-box simulator for generating the trajectories of the closed-loop system given any initial state. We demonstrate the scalability and generality of our analysis tool on several numerical examples with large-scale NN policies and high-dimensional image observations. We believe that our proposed analysis serves as a meaningful initial step toward further understanding of closed-loop stability of large-scale nonlinear systems and perception-based control.
LGNov 30, 2021
Model-Free $μ$ Synthesis via Adversarial Reinforcement LearningDarioush Keivan, Aaron Havens, Peter Seiler et al.
Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely $μ$ synthesis. We build a connection between robust adversarial RL and $μ$ synthesis, and develop a model-free version of the well-known $DK$-iteration for solving state-feedback $μ$ synthesis with static $D$-scaling. In the proposed algorithm, the $K$ step mimics the classical central path algorithm via incorporating a recently-developed double-loop adversarial RL method as a subroutine, and the $D$ step is based on model-free finite difference approximation. Extensive numerical study is also presented to demonstrate the utility of our proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control.
OCNov 24, 2020
Policy Optimization for Markovian Jump Linear Quadratic Control: Gradient-Based Methods and Global ConvergenceJoao Paulo Jansch-Porto, Bin Hu, Geir Dullerud
Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the global convergence of gradient-based policy optimization methods for quadratic optimal control of discrete-time Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS, with static state feedback controllers and quadratic performance costs. Despite the non-convexity of the resultant problem, we are still able to identify several useful properties such as coercivity, gradient dominance, and almost smoothness. Based on these properties, we show global convergence of three types of policy optimization methods: the gradient descent method; the Gauss-Newton method; and the natural policy gradient method. We prove that all three methods converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller which is mean-square stabilizing. Some numerical examples are presented to support the theory. This work brings new insights for understanding the performance of policy gradient methods on the Markovian jump linear quadratic control problem.
OCJun 4, 2020
Policy Learning of MDPs with Mixed Continuous/Discrete Variables: A Case Study on Model-Free Control of Markovian Jump SystemsJoao Paulo Jansch-Porto, Bin Hu, Geir Dullerud
Markovian jump linear systems (MJLS) are an important class of dynamical systems that arise in many control applications. In this paper, we introduce the problem of controlling unknown (discrete-time) MJLS as a new benchmark for policy-based reinforcement learning of Markov decision processes (MDPs) with mixed continuous/discrete state variables. Compared with the traditional linear quadratic regulator (LQR), our proposed problem leads to a special hybrid MDP (with mixed continuous and discrete variables) and poses significant new challenges due to the appearance of an underlying Markov jump parameter governing the mode of the system dynamics. Specifically, the state of a MJLS does not form a Markov chain and hence one cannot study the MJLS control problem as a MDP with solely continuous state variable. However, one can augment the state and the jump parameter to obtain a MDP with a mixed continuous/discrete state space. We discuss how control theory sheds light on the policy parameterization of such hybrid MDPs. Then we modify the widely used natural policy gradient method to directly learn the optimal state feedback control policy for MJLS without identifying either the system dynamics or the transition probability of the switching parameter. We implement the (data-driven) natural policy gradient method on different MJLS examples. Our simulation results suggest that the natural gradient method can efficiently learn the optimal controller for MJLS with unknown dynamics.
OCFeb 10, 2020
Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear SystemsJoao Paulo Jansch-Porto, Bin Hu, Geir Dullerud
Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the convergence of policy optimization for quadratic control of Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS, and, in particular, show that despite the non-convexity of the resultant problem the unique stationary point is the global optimal solution. Next, we prove that the Gauss-Newton method and the natural policy gradient method converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller which stabilizes the closed-loop dynamics in the mean square sense. We propose a novel Lyapunov argument to fix a key stability issue in the convergence proof. Finally, we present a numerical example to support our theory. Our work brings new insights for understanding the performance of policy learning methods on controlling unknown MJLS.
LGNov 4, 2019
Verification and Parameter Synthesis for Stochastic Systems using Optimistic OptimizationNegin Musavi, Dawei Sun, Sayan Mitra et al.
We present an algorithm for formal verification and parameter synthesis of continuous state-space Markov chains. This class of problems captures the design and analysis of a wide variety of autonomous and cyber-physical systems defined by nonlinear and black-box modules. In order to solve these problems, one has to maximize certain probabilistic objective functions overall choices of initial states and parameters. In this paper, we identify the assumptions that make it possible to view this problem as a multi-armed bandit problem. Based on this fresh perspective, we propose an algorithm (HOO-MB) for solving the problem that carefully instantiates an existing bandit algorithm -- Hierarchical Optimistic Optimization -- with appropriate parameters. As a consequence, we obtain theoretical regret bounds on sample efficiency of our solution that depends on key problem parameters like smoothness, near-optimality dimension, and batch size. The batch size parameter enables us to strike a balance between the sample efficiency and the memory usage of the algorithm. Our experiments, using the tool HooVer, suggest that the approach scales to realistic-sized problems and is often more sample-efficient compared to PlasmaLab -- a leading tool for verification of stochastic systems. Specifically, HooVer has distinct advantages in analyzing models in which the objective function has sharp slopes. In addition, HooVer shows promising behavior in parameter synthesis for a linear quadratic regulator (LQR) example.
ROOct 3, 2019
CyPhyHouse: A Programming, Simulation, and Deployment Toolchain for Heterogeneous Distributed CoordinationRitwika Ghosh, Joao P. Jansch-Porto, Chiao Hsieh et al.
Programming languages, libraries, and development tools have transformed the application development processes for mobile computing and machine learning. This paper introduces the CyPhyHouse - a toolchain that aims to provide similar programming, debugging, and deployment benefits for distributed mobile robotic applications. Users can develop hardware-agnostic, distributed applications using the high-level, event driven Koord programming language, without requiring expertise in controller design or distributed network protocols. The modular, platform-independent middleware of CyPhyHouse implements these functionalities using standard algorithms for path planning (RRT), control (MPC), mutual exclusion, etc. A high-fidelity, scalable, multi-threaded simulator for Koord applications is developed to simulate the same application code for dozens of heterogeneous agents. The same compiled code can also be deployed on heterogeneous mobile platforms. The effectiveness of CyPhyHouse in improving the design cycles is explicitly illustrated in a robotic testbed through development, simulation, and deployment of a distributed task allocation application on in-house ground and aerial vehicles.
SYJan 18, 2015
Controller Synthesis for Linear Time-varying Systems with AdversariesZhenqi Huang, Yu Wang, Sayan Mitra et al.
We present a controller synthesis algorithm for a discrete time reach-avoid problem in the presence of adversaries. Our model of the adversary captures typical malicious attacks envisioned on cyber-physical systems such as sensor spoofing, controller corruption, and actuator intrusion. After formulating the problem in a general setting, we present a sound and complete algorithm for the case with linear dynamics and an adversary with a budget on the total L2-norm of its actions. The algorithm relies on a result from linear control theory that enables us to decompose and precisely compute the reachable states of the system in terms of a symbolic simulation of the adversary-free dynamics and the total uncertainty induced by the adversary. With this decomposition, the synthesis problem eliminates the universal quantifier on the adversary's choices and the symbolic controller actions can be effectively solved using an SMT solver. The constraints induced by the adversary are computed by solving second-order cone programmings. The algorithm is later extended to synthesize state-dependent controller and to generate attacks for the adversary. We present preliminary experimental results that show the effectiveness of this approach on several example problems.
CRJul 18, 2012
Differentially Private Iterative Synchronous ConsensusZhenqi Huang, Sayan Mitra, Geir Dullerud
The iterative consensus problem requires a set of processes or agents with different initial values, to interact and update their states to eventually converge to a common value. Protocols solving iterative consensus serve as building blocks in a variety of systems where distributed coordination is required for load balancing, data aggregation, sensor fusion, filtering, clock synchronization and platooning of autonomous vehicles. In this paper, we introduce the private iterative consensus problem where agents are required to converge while protecting the privacy of their initial values from honest but curious adversaries. Protecting the initial states, in many applications, suffice to protect all subsequent states of the individual participants. First, we adapt the notion of differential privacy in this setting of iterative computation. Next, we present a server-based and a completely distributed randomized mechanism for solving private iterative consensus with adversaries who can observe the messages as well as the internal states of the server and a subset of the clients. Finally, we establish the tradeoff between privacy and the accuracy of the proposed randomized mechanism.