John Lygeros

SY
h-index61
77papers
857citations
Novelty50%
AI Score57

77 Papers

OCSep 30, 2016
On Submodularity and Controllability in Complex Dynamical Networks

Tyler H. Summers, Fabrizio L. Cortesi, John Lygeros

Controllability and observability have long been recognized as fundamental structural properties of dynamical systems, but have recently seen renewed interest in the context of large, complex networks of dynamical systems. A basic problem is sensor and actuator placement: choose a subset from a finite set of possible placements to optimize some real-valued controllability and observability metrics of the network. Surprisingly little is known about the structure of such combinatorial optimization problems. In this paper, we show that several important classes of metrics based on the controllability and observability Gramians have a strong structural property that allows for either efficient global optimization or an approximation guarantee by using a simple greedy heuristic for their maximization. In particular, the mapping from possible placements to several scalar functions of the associated Gramian is either a modular or submodular set function. The results are illustrated on randomly generated systems and on a problem of power electronic actuator placement in a model of the European power grid.

SYMay 8, 2017
Aggregation and Disaggregation of Energetic Flexibility from Distributed Energy Resources

Fabian L. Müller, Jácint Szabó, Olle Sundström et al.

A variety of energy resources has been identified as being flexible in their electric energy consumption or generation. This energetic flexibility can be used for various purposes such as minimizing energy procurement costs or providing ancillary services to power grids. To fully leverage the flexibility available from distributed small-scale resources, their flexibility must be quantified and aggregated. This paper introduces a generic and scalable approach for flexible energy systems to quantitatively describe and price their flexibility based on zonotopic sets. The description proposed allows aggregators to efficiently pool the flexibility of large numbers of systems and to make control and market decisions on the aggregate level. In addition, an algorithm is presented that distributes aggregate-level control decisions among the individual systems of the pool in an economically fair and computationally efficient way. Finally, it is shown how the zonotopic description of flexibility enables an efficient computation of aggregate regulation power bid-curves.

OCFeb 15, 2013
Symbolic control of stochastic systems via approximately bisimilar finite abstractions

Majid Zamani, Peyman Mohajerin Esfahani, Rupak Majumdar et al.

Symbolic approaches to the control design over complex systems employ the construction of finite-state models that are related to the original control systems, then use techniques from finite-state synthesis to compute controllers satisfying specifications given in a temporal logic, and finally translate the synthesized schemes back as controllers for the concrete complex systems. Such approaches have been successfully developed and implemented for the synthesis of controllers over non-probabilistic control systems. In this paper, we extend the technique to probabilistic control systems modeled by controlled stochastic differential equations. We show that for every stochastic control system satisfying a probabilistic variant of incremental input-to-state stability, and for every given precision $\varepsilon>0$, a finite-state transition system can be constructed, which is $\varepsilon$-approximately bisimilar (in the sense of moments) to the original stochastic control system. Moreover, we provide results relating stochastic control systems to their corresponding finite-state transition systems in terms of probabilistic bisimulation relations known in the literature. We demonstrate the effectiveness of the construction by synthesizing controllers for stochastic control systems over rich specifications expressed in linear temporal logic. The discussed technique enables a new, automated, correct-by-construction controller synthesis approach for stochastic control systems, which are common mathematical models employed in many safety critical systems subject to structured uncertainty and are thus relevant for cyber-physical applications.

SYApr 30, 2018
Nash and Wardrop equilibria in aggregative games with coupling constraints

Dario Paccagnan, Basilio Gentile, Francesca Parise et al.

We consider the framework of aggregative games, in which the cost function of each agent depends on his own strategy and on the average population strategy. As first contribution, we investigate the relations between the concepts of Nash and Wardrop equilibrium. By exploiting a characterization of the two equilibria as solutions of variational inequalities, we bound their distance with a decreasing function of the population size. As second contribution, we propose two decentralized algorithms that converge to such equilibria and are capable of coping with constraints coupling the strategies of different agents. Finally, we study the applications of charging of electric vehicles and of route choice on a road network.

OCJun 7, 2016
Robust Optimal Control with Adjustable Uncertainty Sets

Xiaojing Zhang, Maryam Kamgarpour, Angelos Georghiou et al.

In this paper, we develop a unified framework for studying constrained robust optimal control problems with adjustable uncertainty sets. In contrast to standard constrained robust optimal control problems with known uncertainty sets, we treat the uncertainty sets in our problems as additional decision variables. In particular, given a finite prediction horizon and a metric for adjusting the uncertainty sets, we address the question of determining the optimal size and shape of the uncertainty sets, while simultaneously ensuring the existence of a control policy that will keep the system within its constraints for all possible disturbance realizations inside the adjusted uncertainty set. Since our problem subsumes the classical constrained robust optimal control design problem, it is computationally intractable in general. We demonstrate that by restricting the families of admissible uncertainty sets and control policies, the problem can be formulated as a tractable convex optimization problem. We show that our framework captures several families of (convex) uncertainty sets of practical interest, and illustrate our approach on a demand response problem of providing control reserves for a power system.

SYMar 18, 2019
Data-Enabled Predictive Control for Grid-Connected Power Converters

Linbin Huang, Jeremy Coulson, John Lygeros et al.

We apply a novel data-enabled predictive control (DeePC) algorithm in grid-connected power converters to perform safe and optimal control. Rather than a model, the DeePC algorithm solely needs input/output data measured from the unknown system to predict future trajectories. We show that the DeePC can eliminate undesired oscillations in a grid-connected power converter and stabilize an unstable system. However, the DeePC algorithm may suffer from poor scalability when applied in high-order systems. To this end, we present a finite-horizon output-based model predictive control (MPC) for grid-connected power converters, which uses an N-step auto-regressive-moving-average (ARMA) model for system representation. The ARMA model is identified via an N-step prediction error method (PEM) in a recursive way. We investigate the connection between the DeePC and the concatenated PEM-MPC method, and then analytically and numerically compare their closed-loop performance. Moreover, the PEM-MPC is applied in a voltage source converter based HVDC station which is connected to a two-area power system so as to eliminate low-frequency oscillations. All of our results are illustrated with high-fidelity, nonlinear, and noisy simulations.

OCDec 6, 2012
Distributed Model Predictive Consensus via the Alternating Direction Method of Multipliers

Tyler H. Summers, John Lygeros

We propose a distributed optimization method for solving a distributed model predictive consensus problem. The goal is to design a distributed controller for a network of dynamical systems to optimize a coupled objective function while respecting state and input constraints. The distributed optimization method is an augmented Lagrangian method called the Alternating Direction Method of Multipliers (ADMM), which was introduced in the 1970s but has seen a recent resurgence in the context of dramatic increases in computing power and the development of widely available distributed computing platforms. The method is applied to position and velocity consensus in a network of double integrators. We find that a few tens of ADMM iterations yield closed-loop performance near what is achieved by solving the optimization problem centrally. Furthermore, the use of recent code generation techniques for solving local subproblems yields fast overall computation times.

OCJan 22, 2016
The Stochastic Reach-Avoid Problem and Set Characterization for Diffusions

Peyman Mohajerin Esfahani, Debasish Chatterjee, John Lygeros

In this article we approach a class of stochastic reachability problems with state constraints from an optimal control perspective. Preceding approaches to solving these reachability problems are either confined to the deterministic setting or address almost-sure stochastic requirements. In contrast, we propose a methodology to tackle problems with less stringent requirements than almost sure. To this end, we first establish a connection between two distinct stochastic reach-avoid problems and three classes of stochastic optimal control problems involving discontinuous payoff functions. Subsequently, we focus on solutions of one of the classes of stochastic optimal control problems---the exit-time problem, which solves both the two reach-avoid problems mentioned above. We then derive a weak version of a dynamic programming principle (DPP) for the corresponding value function; in this direction our contribution compared to the existing literature is to develop techniques that admit discontinuous payoff functions. Moreover, based on our DPP, we provide an alternative characterization of the value function as a solution of a partial differential equation in the sense of discontinuous viscosity solutions, along with boundary conditions both in Dirichlet and viscosity senses. Theoretical justifications are also discussed to pave the way for deployment of off-the-shelf PDE solvers for numerical computations. Finally, we validate the performance of the proposed framework on the stochastic Zermelo navigation problem.

OCJan 22, 2016
A Tractable Fault Detection and Isolation Approach for Nonlinear Systems with Probabilistic Performance

Peyman Mohajerin Esfahani, John Lygeros

This article presents a novel perspective along with a scalable methodology to design a fault detection and isolation (FDI) filter for high dimensional nonlinear systems. Previous approaches on FDI problems are either confined to linear systems or they are only applicable to low dimensional dynamics with specific structures. In contrast, shifting attention from the system dynamics to the disturbance inputs, we propose a relaxed design perspective to train a linear residual generator given some statistical information about the disturbance patterns. That is, we propose an optimization-based approach to robustify the filter with respect to finitely many signatures of the nonlinearity. We then invoke recent results in randomized optimization to provide theoretical guarantees for the performance of the proposed filer. Finally, motivated by a cyber-physical attack emanating from the vulnerabilities introduced by the interaction between IT infrastructure and power system, we deploy the developed theoretical results to detect such an intrusion before the functionality of the power system is disrupted.

OCJul 19, 2016
The Power of Diversity: Data-Driven Robust Predictive Control for Energy Efficient Buildings and Districts

Georgios Darivianakis, Angelos Georghiou, Roy S. Smith et al.

The cooperative energy management of aggregated buildings has recently received a great deal of interest due to substantial potential energy savings. These gains are mainly obtained in two ways: (i) Exploiting the load shifting capabilities of the cooperative buildings; (ii) Utilizing the expensive but energy efficient equipment that is commonly shared by the building community (e.g., heat pumps, batteries and photovoltaics). Several deterministic and stochastic control schemes that strive to realize these savings, have been proposed in the literature. A common difficulty with all these methods is integrating knowledge about the disturbances affecting the system. In this context, the underlying disturbance distributions are often poorly characterized based on historical data. In this paper, we address this issue by exploiting the historical data to construct families of distributions which contain these underlying distributions with high confidence. We then employ tools from data-driven robust optimization to formulate a multistage stochastic optimization problem which can be approximated by a finite-dimensional linear program. The proposed method is suitable for tackling large scale systems since its complexity grows polynomially with respect to the system variables. We demonstrate its efficacy in a numerical study, in which it is shown to outperform, in terms of energy cost savings and constraint violations, established solution techniques from the literature. We conclude this study by showing the significant energy gains that are obtained by cooperatively managing a collection of buildings with heterogeneous characteristics.

OCFeb 20, 2017
From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming

Peyman Mohajerin Esfahani, Tobias Sutter, Daniel Kuhn et al.

We consider linear programming (LP) problems in infinite dimensional spaces that are in general computationally intractable. Under suitable assumptions, we develop an approximation bridge from the infinite-dimensional LP to tractable finite convex programs in which the performance of the approximation is quantified explicitly. To this end, we adopt the recent developments in two areas of randomized optimization and first order methods, leading to a priori as well as a posterior performance guarantees. We illustrate the generality and implications of our theoretical results in the special case of the long-run average cost and discounted cost optimal control problems for Markov decision processes on Borel spaces. The applicability of the theoretical results is demonstrated through a constrained linear quadratic optimal control problem and a fisheries management problem.

OCDec 6, 2012
Approximate Dynamic Programming via Sum of Squares Programming

Tyler H. Summers, Konstantin Kunz, Nikolaos Kariotoglou et al.

We describe an approximate dynamic programming method for stochastic control problems on infinite state and input spaces. The optimal value function is approximated by a linear combination of basis functions with coefficients as decision variables. By relaxing the Bellman equation to an inequality, one obtains a linear program in the basis coefficients with an infinite set of constraints. We show that a recently introduced method, which obtains convex quadratic value function approximations, can be extended to higher order polynomial approximations via sum of squares programming techniques. An approximate value function can then be computed offline by solving a semidefinite program, without having to sample the infinite constraint. The policy is evaluated online by solving a polynomial optimization problem, which also turns out to be convex in some cases. We experimentally validate the method on an autonomous helicopter testbed using a 10-dimensional helicopter model.

OCMar 20, 2018
Distributed Model Predictive Control for Linear Systems with Adaptive Terminal Sets

Georgios Darivianakis, Annika Eichler, John Lygeros

In this paper, we propose a distributed model predictive control (DMPC) scheme for linear time-invariant constrained systems which admit a separable structure. To exploit the merits of distributed computation algorithms, the stabilizing terminal controller, value function and invariant terminal set of the DMPC optimization problem need to respect the loosely coupled structure of the system. Although existing methods in the literature address this task, they typically decouple the synthesis of terminal controllers and value functions from the one of terminal sets. In addition, these approaches do not explicitly consider the effect of the current state of the system in the synthesis process. These limitations can lead the resulting DMPC scheme to poor performance since it may admit small or even empty terminal sets. Unlike other approaches, this paper presents a unified framework to encapsulate the synthesis of both the stabilizing terminal controller and invariant terminal set into the DMPC formulation. Conditions for Lyapunov stability and invariance are imposed in the synthesis problem in a way that allows the value function and invariant terminal set to admit the desired distributed structure. We illustrate the effectiveness of the proposed method on several examples including a benchmark spring-mass-damper problem.

SYJun 20, 2018
Unlocking the Potential of Flexible Energy Resources to Help Balance the Power Grid

Fabian L. Müller, Stefan Woerner, John Lygeros

Flexible energy resources can help balance the power grid by providing different types of ancillary services. However, the balancing potential of most types of resources is restricted by physical constraints such as the size of their energy buffer, limits on power-ramp rates, or control delays. Using the example of Secondary Frequency Regulation, this paper shows how the flexibility of various resources can be exploited more efficiently by considering multiple resources with complementary physical properties and controlling them in a coordinated way. To this end, optimal adjustable control policies are computed based on robust optimization. Our problem formulation takes into account power ramp-rate constraints explicitly, and accurately models the different timescales and lead times of the energy and reserve markets. Simulations demonstrate that aggregations of select resources can offer significantly more regulation capacity than the resources could provide individually.

SYOct 17, 2017
A distributed algorithm for average aggregative games with coupling constraints

Francesca Parise, Basilio Gentile, John Lygeros

We consider the framework of average aggregative games, where the cost function of each agent depends on his own strategy and on the average population strategy. We focus on the case in which the agents are coupled not only via their cost functions, but also via constraints coupling their strategies. We propose a distributed algorithm that achieves an almost Nash equilibrium by requiring only local communications of the agents, as specified by a sparse communication network. The proof of convergence of the algorithm relies on the auxiliary class of network aggregative games and exploits a novel result of parametric convergence of variational inequalities, which is applicable beyond the context of games. We apply our theoretical findings to a multi-market Cournot game with transportation costs and maximum market capacity.

OCJan 22, 2016
Motion Planning for Continuous Time Stochastic Processes: A Dynamic Programming Approach

Peyman Mohajerin Esfahani, Debasish Chatterjee, John Lygeros

We study stochastic motion planning problems which involve a controlled process, with possibly discontinuous sample paths, visiting certain subsets of the state-space while avoiding others in a sequential fashion. For this purpose, we first introduce two basic notions of motion planning, and then establish a connection to a class of stochastic optimal control problems concerned with sequential stopping times. A weak dynamic programming principle (DPP) is then proposed, which characterizes the set of initial states that admit a control enabling the process to execute the desired maneuver with probability no less than some pre-specified value. The proposed DPP comprises auxiliary value functions defined in terms of discontinuous payoff functions. A concrete instance of the use of this novel DPP in the case of diffusion processes is also presented. In this case, we establish that the aforementioned set of initial states can be characterized as the level set of a discontinuous viscosity solution to a sequence of partial differential equations, for which the first one has a known boundary condition, while the boundary conditions of the subsequent ones are determined by the solutions to the preceding steps. Finally, the generality and flexibility of the theoretical results are illustrated on an example involving biological switches.

SYMay 15, 2018
On the Efficiency of Nash Equilibria in Aggregative Charging Games

Dario Paccagnan, Francesca Parise, John Lygeros

Several works have recently suggested to model the problem of coordinating the charging needs of a fleet of electric vehicles as a game, and have proposed distributed algorithms to coordinate the vehicles towards a Nash equilibrium of such game. However, Nash equilibria have been shown to posses desirable system-level properties only in simplified cases. In this work, we use the concept of price of anarchy to analyze the inefficiency of Nash equilibria when compared to the social optimum solution. More precisely, we show that i) for linear price functions depending on all the charging instants, the price of anarchy converges to one as the population of vehicles grows; ii) for price functions that depend only on the instantaneous demand, the price of anarchy converges to one if the price function takes the form of a positive pure monomial; iii) for general classes of price functions, the asymptotic price of anarchy can be bounded. For finite populations, we additionaly provide a bound on the price of anarchy as a function of the number vehicles in the system. We support the theoretical findings by means of numerical simulations.

LGMay 24, 2022
Advanced Manufacturing Configuration by Sample-efficient Batch Bayesian Optimization

Xavier Guidetti, Alisa Rupenyan, Lutz Fassl et al.

We propose a framework for the configuration and operation of expensive-to-evaluate advanced manufacturing methods, based on Bayesian optimization. The framework unifies a tailored acquisition function, a parallel acquisition procedure, and the integration of process information providing context to the optimization procedure. \cmtb{The novel acquisition function is demonstrated, analyzed and compared on state-of-the-art benchmarking problems. We apply the optimization approach to atmospheric plasma spraying and fused deposition modeling.} Our results demonstrate that the proposed framework can efficiently find input parameters that produce the desired outcome and minimize the process cost.

OCOct 9, 2016
Strong Stationarity Conditions for Optimal Control of Hybrid Systems

Andreas B. Hempel, Paul Goulart, John Lygeros

We present necessary and sufficient optimality conditions for finite time optimal control problems for a class of hybrid systems described by linear complementarity models. Although these optimal control problems are difficult in general due to the presence of complementarity constraints, we provide a set of structural assumptions ensuring that the tangent cone of the constraints possesses geometric regularity properties. These imply that the classical Karush-Kuhn-Tucker conditions of nonlinear programming theory are both necessary and sufficient for local optimality, which is not the case for general mathematical programs with complementarity constraints. We also present sufficient conditions for global optimality. We proceed to show that the dynamics of every continuous piecewise affine system can be written as the optimizer of a mathematical program which results in a linear complementarity model satisfying our structural assumptions. Hence, our stationarity results apply to a large class of hybrid systems with piecewise affine dynamics. We present simulation results showing the substantial benefits possible from using a nonlinear programming approach to the optimal control problem with complementarity constraints instead of a more traditional mixed-integer formulation.

SYNov 14, 2022
Implications of Regret on Stability of Linear Dynamical Systems

Aren Karapetyan, Anastasios Tsiamis, Efe C. Balta et al.

The setting of an agent making decisions under uncertainty and under dynamic constraints is common for the fields of optimal control, reinforcement learning, and recently also for online learning. In the online learning setting, the quality of an agent's decision is often quantified by the concept of regret, comparing the performance of the chosen decisions to the best possible ones in hindsight. While regret is a useful performance measure, when dynamical systems are concerned, it is important to also assess the stability of the closed-loop system for a chosen policy. In this work, we show that for linear state feedback policies and linear systems subject to adversarial disturbances, linear regret implies asymptotic stability in both time-varying and time-invariant settings. Conversely, we also show that bounded input bounded state stability and summability of the state transition matrices imply linear regret.

SYApr 24, 2023
Stochastic MPC for energy hubs using data driven demand forecasting

Varsha Behrunani, Francesco Micheli, Jonas Mehr et al.

Energy hubs convert and distribute energy resources by combining different energy inputs through multiple conversion and storage components. The optimal operation of the energy hub exploits its flexibility to increase the energy efficiency and reduce the operational costs. However, uncertainties in the demand present challenges to energy hub optimization. In this paper, we propose a stochastic MPC controller to minimize energy costs using chance constraints for the uncertain electricity and thermal demands. Historical data is used to build a demand prediction model based on Gaussian processes to generate a forecast of the future electricity and heat demands. The stochastic optimization problem is solved via the Scenario Approach by sampling multi-step demand trajectories from the derived prediction model. The performance of the proposed predictor and of the stochastic controller is verified on a simulated energy hub model and demand data from a real building.

OCJun 27, 2021
Decentralized decision making for networks of uncertain systems

Georgios Darivianakis, Angelos Georghiou, John Lygeros

Distributed model predictive control (MPC) has been proven a successful method in regulating the operation of large-scale networks of constrained dynamical systems. This paper is concerned with cooperative distributed MPC in which the decision actions of the systems are usually derived by the solution of a system-wide optimization problem. However, formulating and solving such large-scale optimization problems is often a hard task which requires extensive information communication among the individual systems and fails to address privacy concerns in the network. Hence, the main challenge is to design decision policies with a prescribed structure so that the resulting system-wide optimization problem to admit a loosely coupled structure and be amendable to distributed computation algorithms. In this paper, we propose a decentralized problem synthesis scheme which only requires each system to communicate sets which bound its states evolution to neighboring systems. The proposed method alleviates concerns on privacy since this limited communication scheme does not reveal the exact characteristics of the dynamics within each system. In addition, it enables a distributed computation of the solution, making our method highly scalable. We demonstrate in a number of numerical studies, inspired by engineering and finance, the efficacy of the proposed approach which leads to solutions that closely approximate those obtained by the centralized formulation only at a fraction of the computational effort.

OCApr 24, 2023
Designing Optimal Personalized Incentive for Traffic Routing using BIG Hype algorithm

Panagiotis D. Grontas, Carlo Cenedese, Marta Fochesato et al.

We study the problem of optimally routing plug-in electric and conventional fuel vehicles on a city level. In our model, commuters selfishly aim to minimize a local cost that combines travel time, from a fixed origin to a desired destination, and the monetary cost of using city facilities, parking or service stations. The traffic authority can influence the commuters' preferred routing choice by means of personalized discounts on parking tickets and on the energy price at service stations. We formalize the problem of designing these monetary incentives optimally as a large-scale bilevel game, where constraints arise at both levels due to the finite capacities of city facilities and incentives budget. Then, we develop an efficient decentralized solution scheme with convergence guarantees based on BIG Hype, a recently-proposed hypergradient-based algorithm for hierarchical games. Finally, we validate our model via numerical simulations over the Anaheim's network, and show that the proposed approach produces sensible results in terms of traffic decongestion and it is able to solve in minutes problems with more than 48000 variables and 110000 constraints.

SYNov 18, 2022
Optimal service station design for traffic mitigation via genetic algorithm and neural network

Carlo Cenedese, Michele Cucuzzella, Adriano Cotta Ramusino et al.

This paper analyzes how the presence of service stations on highways affects traffic congestion. We focus on the problem of optimally designing a service station to achieve beneficial effects in terms of total traffic congestion and peak traffic reduction. Microsimulators cannot be used for this task due to their computational inefficiency. We propose a genetic algorithm based on the recently proposed CTMs, that efficiently describes the dynamics of a service station. Then, we leverage the algorithm to train a neural network capable of solving the same problem, avoiding implementing the CTMs. Finally, we examine two case studies to validate the capabilities and performance of our algorithms. In these simulations, we use real data extracted from Dutch highways.

SYNov 14, 2022
Follow the Clairvoyant: an Imitation Learning Approach to Optimal Control

Andrea Martin, Luca Furieri, Florian Dörfler et al.

We consider control of dynamical systems through the lens of competitive analysis. Most prior work in this area focuses on minimizing regret, that is, the loss relative to an ideal clairvoyant policy that has noncausal access to past, present, and future disturbances. Motivated by the observation that the optimal cost only provides coarse information about the ideal closed-loop behavior, we instead propose directly minimizing the tracking error relative to the optimal trajectories in hindsight, i.e., imitating the clairvoyant policy. By embracing a system level perspective, we present an efficient optimization-based approach for computing follow-the-clairvoyant (FTC) safe controllers. We prove that these attain minimal regret if no constraints are imposed on the noncausal benchmark. In addition, we present numerical experiments to show that our policy retains the hallmark of competitive algorithms of interpolating between classical $\mathcal{H}_2$ and $\mathcal{H}_\infty$ control laws - while consistently outperforming regret minimization methods in constrained scenarios thanks to the superior ability to chase the clairvoyant.

SYApr 10, 2022
Regret Analysis of Online Gradient Descent-based Iterative Learning Control with Model Mismatch

Efe C. Balta, Andrea Iannelli, Roy S. Smith et al.

In Iterative Learning Control (ILC), a sequence of feedforward control actions is generated at each iteration on the basis of partial model knowledge and past measurements with the goal of steering the system toward a desired reference trajectory. This is framed here as an online learning task, where the decision-maker takes sequential decisions by solving a sequence of optimization problems having only partial knowledge of the cost functions. Having established this connection, the performance of an online gradient-descent based scheme using inexact gradient information is analyzed in the setting of dynamic and static regret, standard measures in online learning. Fundamental limitations of the scheme and its integration with adaptation mechanisms are further investigated, followed by numerical simulations on a benchmark ILC problem.

SYFeb 18, 2019
Nonlinear Control of Quadcopters via Approximate Dynamic Programming

Angel Romero, Paul N. Beuchat, Yvonne R. Stürz et al.

While Approximate Dynamic Programming has successfully been used in many applications involving discrete states and inputs such as playing the games of Tetris or chess, it has not been used in many continuous state and input space applications. In this paper, we combine Approximate Dynamic Programming techniques and apply them to the continuous, non-linear and high dimensional dynamics of a quadcopter vehicle. We use a polynomial approximation of the dynamics and sum-of-squares programming techniques to compute a family of polynomial value function approximations for different tuning parameters. The resulting approximations to the optimal value function are combined in a point-wise maximum approach, which is used to compute the online policy. The success of the method is demonstrated in both simulations and experiments on a quadcopter. The control performance is compared to a linear time-varying Model Predictive Controller. The two methods are then combined to keep the computational benefits of a short horizon MPC and the long term performance benefits of the Approximate Dynamic Programming value function as the terminal cost.

SYAug 30, 2018
Performance guarantees for model-based Approximate Dynamic Programming in continuous spaces

Paul N. Beuchat, Angelos Georghiou, John Lygeros

We study both the value function and Q-function formulation of the Linear Programming approach to Approximate Dynamic Programming. The approach is model-based and optimizes over a restricted function space to approximate the value function or Q-function. Working in the discrete time, continuous space setting, we provide guarantees for the fitting error and online performance of the policy. In particular, the online performance guarantee is obtained by analyzing an iterated version of the greedy policy, and the fitting error guarantee by analyzing an iterated version of the Bellman inequality. These guarantees complement the existing bounds that appear in the literature. The Q-function formulation offers benefits, for example, in decentralized controller design, however it can lead to computationally demanding optimization problems. To alleviate this drawback, we provide a condition that simplifies the formulation, resulting in improved computational times.

SYApr 11, 2018
Ramp-Rate-Constrained Bidding of Energy and Frequency Reserves in Real Market Settings

Fabian L. Müller, Stefan Woerner, John Lygeros

The energetic flexibility of electric energy resources can be exploited when trading on wholesale energy and ancillary service markets. This paper considers the problem of a Balance Responsible Party to maximize its profit from trading on energy markets while simultaneously offering Secondary Frequency Reserves to the System Operator. However, the accurate provision of regulation power can be compromised by power ramp-rate limitations of the resources providing the service. To avoid this shortcoming, we take power ramp-rate constraints into account explicitly when computing optimal adjustable energy trading policies that are robust against uncertain activation of reserves. The approach proposed is applicable in real market settings because it models all the different timescales of the day-ahead, intra-day, and reserve markets. Finally, the effect of different market settings and energy trading policies on the amount of available Secondary Frequency Reserves is investigated.

OCOct 6, 2017
The Linear Programming Approach to Reach-Avoid Problems for Markov Decision Processes

Nikolaos Kariotoglou, Maryam Kamgarpour, Tyler Summers et al.

One of the most fundamental problems in Markov decision processes is analysis and control synthesis for safety and reachability specifications. We consider the stochastic reach-avoid problem, in which the objective is to synthesize a control policy to maximize the probability of reaching a target set at a given time, while staying in a safe set at all prior times. We characterize the solution to this problem through an infinite dimensional linear program. We then develop a tractable approximation to the infinite dimensional linear program through finite dimensional approximations of the decision space and constraints. For a large class of Markov decision processes modeled by Gaussian mixtures kernels we show that through a proper selection of the finite dimensional space, one can further reduce the computational complexity of the resulting linear program. We validate the proposed method and analyze its potential with a series of numerical case studies.

DSNov 15, 2012
Isospectral flows on a class of finite-dimensional Jacobi matrices

Tobias Sutter, Debasish Chatterjee, Federico Ramponi et al.

We present a new matrix-valued isospectral ordinary differential equation that asymptotically block-diagonalizes $n\times n$ zero-diagonal Jacobi matrices employed as its initial condition. This o.d.e.\ features a right-hand side with a nested commutator of matrices, and structurally resembles the double-bracket o.d.e.\ studied by R.W.\ Brockett in 1991. We prove that its solutions converge asymptotically, that the limit is block-diagonal, and above all, that the limit matrix is defined uniquely as follows: For $n$ even, a block-diagonal matrix containing $2\times 2$ blocks, such that the super-diagonal entries are sorted by strictly increasing absolute value. Furthermore, the off-diagonal entries in these $2\times 2$ blocks have the same sign as the respective entries in the matrix employed as initial condition. For $n$ odd, there is one additional $1\times 1$ block containing a zero that is the top left entry of the limit matrix. The results presented here extend some early work of Kac and van Moerbeke.

SYMar 30, 2016
Distributed Learning in the Presence of Disturbances

Chithrupa Ramesh, Marius Schmitt, John Lygeros

We consider a problem where multiple agents must learn an action profile that maximises the sum of their utilities in a distributed manner. The agents are assumed to have no knowledge of either the utility functions or the actions and payoffs of other agents. These assumptions arise when modelling the interactions in a complex system and communicating between various components of the system are both difficult. In [1], a distributed algorithm was proposed, which learnt Pareto-efficient solutions in this problem setting. However, the approach assumes that all agents can choose their actions, which precludes disturbances. In this paper, we show that a modified version of this distributed learning algorithm can learn Pareto-efficient solutions, even in the presence of disturbances from a finite set. We apply our approach to the problem of ramp coordination in traffic control for different demand profiles.

OCApr 20, 2011
On mean-square boundedness of stochastic linear systems with quantized observations

Debasish Chatterjee, Peter Hokayem, Federico Ramponi et al.

We propose a procedure to design a state-quantizer with finitely many bins for a marginally stable stochastic linear system evolving in $\R^d$, and a bounded policy based on the resulting quantized state measurements to ensure bounded second moment in closed-loop.

SYMay 1, 2017
Computing the projected reachable set of switched affine systems: an application to systems biology

Francesca Parise, Maria Elena Valcher, John Lygeros

A fundamental question in systems biology is what combinations of mean and variance of the species present in a stochastic biochemical reaction network are attainable by perturbing the system with an external signal. To address this question, we show that the moments evolution in any generic network can be either approximated or, under suitable assumptions, computed exactly as the solution of a switched affine system. Motivated by this application, we propose a new method to approximate the reachable set of switched affine systems. A remarkable feature of our approach is that it allows one to easily compute projections of the reachable set for pairs of moments of interest, without requiring the computation of the full reachable set, which can be prohibitive for large networks. As a second contribution, we also show how to select the external signal in order to maximize the probability of reaching a target set. To illustrate the method we study a renown model of controlled gene expression and we derive estimates of the reachable set, for the protein mean and variance, that are more accurate than those available in the literature and consistent with experimental data.

18.6SYApr 20
Policy Optimization for Unknown Systems using Differentiable Model Predictive Control

Riccardo Zuliani, Efe C. Balta, John Lygeros

Model-based policy optimization often struggles with inaccurate system dynamics models, leading to suboptimal closed-loop performance. This challenge is especially evident in Model Predictive Control (MPC) policies, which rely on the model for real-time trajectory planning and optimization. We introduce a novel policy optimization framework for MPC-based policies combining differentiable optimization with zeroth-order optimization. Our method combines model-based and model-free gradient estimation approaches, achieving faster transient performance compared to fully data-driven approaches while maintaining convergence guarantees, even under model uncertainty. We demonstrate the effectiveness of the proposed approach on a nonlinear control task involving a 12-dimensional quadcopter model.

LGSep 26, 2024
Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel

Jialin Li, Marta Zagorowska, Giulia De Pasquale et al.

Ensuring safety is a key aspect in sequential decision making problems, such as robotics or process control. The complexity of the underlying systems often makes finding the optimal decision challenging, especially when the safety-critical system is time-varying. Overcoming the problem of optimizing an unknown time-varying reward subject to unknown time-varying safety constraints, we propose TVSafeOpt, a new algorithm built on Bayesian optimization with a spatio-temporal kernel. The algorithm is capable of safely tracking a time-varying safe region without the need for explicit change detection. Optimality guarantees are also provided for the algorithm when the optimization problem becomes stationary. We show that TVSafeOpt compares favorably against SafeOpt on synthetic data, both regarding safety and optimality. Evaluation on a realistic case study with gas compressors confirms that TVSafeOpt ensures safety when solving time-varying optimization problems with unknown reward and safety functions.

7.2AIMay 21
Deep Reinforcement Learning for Flexible Job Shop Scheduling with Random Job Arrivals

Yu Tang, Muhammad Zakwan, Efe Balta et al.

The Flexible Job Shop Scheduling Problem (FJSP) is the optimal allocation of a set of jobs to machines. Two primary challenges persist in FJSP: the unpredictable arrival of future jobs and the combinatorial complexity of the problem, rendering it intractable for conventional mixed-integer linear programming solvers. This paper proposes an event-based \gls{DRL} approach to solve FJSP with random job arrivals. Specifically, we employ the Proximal Policy Optimization algorithm and use lightweight Multi-Layer Perceptrons to train the \gls{DRL} agent for minimizing the total completion time of all jobs. We design the state representation to be directly accessible from the environment, and limit the learning agent to selecting from among a set of well-established dispatching rules. Simulations show that our \gls{DRL} approach outperforms any of the individual dispatching rules on datasets with varying heterogeneity and job arrival rates. We benchmark our \gls{DRL} against an arrival-triggered mixed-integer linear programming solution and show that our method achieves good performance especially when the datasets are heterogeneous.

16.7AIMay 21
Meta-Learning for Rapid Adaptation in Reference Tracking of Uncertain Nonlinear Systems

Jiaqi Yan, Ankush Chakrabarty, Niklas Schmid et al.

In this paper, we address the problem of reference tracking for uncertain nonlinear systems. Since collecting data from the target system (i.e., the system of interest) is often challenging, our objective is to design optimal controllers using limited target system data. Meta-learning provides a promising paradigm by leveraging offline data from source systems (systems sharing structural similarities with the target system) to accelerate training and enhance control performance. Motivated by this idea, we propose a meta-learning-based control framework that tailors the implicit model-agnostic meta-learning (iMAML) algorithm to the control setting. The framework operates in two phases: an (offline) meta-training phase, where an aggregated representation is learned from source data to capture the shared system dynamics among similar systems, and an (online) meta-adaptation phase, where this representation is fine-tuned on the target system using only a few data samples and limited adaptation steps. We formulate this framework as a bi-level optimization problem and provide an efficient solution with reduced storage complexity and few approximations. The proposed framework is general, allowing various learning algorithms to be integrated. To demonstrate this flexibility, we propose two specific learning algorithms that can be incorporated into our framework based on a neural state-space model and a deep Q-network, respectively. The primary distinction between these approaches is whether explicit system identification is required. Numerical simulations and hardware experiments demonstrate that the proposed methods enhance control performance and consistently outperform baseline approaches.

15.6SYApr 20
Scenario-Based Stochastic MPC for Energy Hubs with EV Fleets Under Persistent Grid Outages

Kobena Badu Enyam, Cara Koepele, Timothy Asare et al.

Emissions reduction and resilience to outages motivate the adoption of renewable microgrids. Surprisingly, research integrating both probabilistic grid outages and electric vehicle (EV) charging requirements remains limited. This paper addresses this gap by developing a scenario-based stochastic model predictive controller (SMPC) for a microgrid energy hub comprising solar generation, battery storage, diesel backup, and an EV fleet connected to a weak grid. Grid outage and campus load scenarios are generated from a continuous-time Markov chain and a Gaussian Process, respectively. Using 2023 operational data from the Ashesi University Energy Hub in Ghana, we demonstrate that the SMPC achieves performance within 1\% of a perfect-forecast benchmark. In contrast, a naive MPC that assumes continuous grid availability offers no economic or sustainability advantage over rule-based control, with both incurring significantly higher costs and emissions than the SMPC. These results highlight that outage anticipation is essential for economic viability. Finally, we show that incorporating a deterministic buffer against EV consumption uncertainty eliminates over 90\% of state-of-charge violations with negligible impact on total operating costs

SYSep 6, 2024
Online Residual Learning from Offline Experts for Pedestrian Tracking

Anastasios Vlachos, Anastasios Tsiamis, Aren Karapetyan et al.

In this paper, we consider the problem of predicting unknown targets from data. We propose Online Residual Learning (ORL), a method that combines online adaptation with offline-trained predictions. At a lower level, we employ multiple offline predictions generated before or at the beginning of the prediction horizon. We augment every offline prediction by learning their respective residual error concerning the true target state online, using the recursive least squares algorithm. At a higher level, we treat the augmented lower-level predictors as experts, adopting the Prediction with Expert Advice framework. We utilize an adaptive softmax weighting scheme to form an aggregate prediction and provide guarantees for ORL in terms of regret. We employ ORL to boost performance in the setting of online pedestrian trajectory prediction. Based on data from the Stanford Drone Dataset, we show that ORL can demonstrate best-of-both-worlds performance.

45.5SYMar 26
Policy Optimization with Differentiable MPC: Convergence Analysis under Uncertainty

Riccardo Zuliani, Efe C. Balta, John Lygeros

Model-based policy optimization is a well-established framework for designing reliable and high-performance controllers across a wide range of control applications. Recently, this approach has been extended to model predictive control policies, where explicit dynamical models are embedded within the control law. However, the performance of the resulting controllers, and the convergence of the associated optimization algorithms, critically depends on the accuracy of the models. In this paper, we demonstrate that combining gradient-based policy optimization with recursive system identification ensures convergence to an optimal controller design and showcase our finding in several control examples.

SYFeb 15, 2024Code
Predictive Linear Online Tracking for Unknown Targets

Anastasios Tsiamis, Aren Karapetyan, Yueshan Li et al.

In this paper, we study the problem of online tracking in linear control systems, where the objective is to follow a moving target. Unlike classical tracking control, the target is unknown, non-stationary, and its state is revealed sequentially, thus, fitting the framework of online non-stochastic control. We consider the case of quadratic costs and propose a new algorithm, called predictive linear online tracking (PLOT). The algorithm uses recursive least squares with exponential forgetting to learn a time-varying dynamic model of the target. The learned model is used in the optimal policy under the framework of receding horizon control. We show the dynamic regret of PLOT scales with $\mathcal{O}(\sqrt{TV_T})$, where $V_T$ is the total variation of the target dynamics and $T$ is the time horizon. Unlike prior work, our theoretical results hold for non-stationary targets. We implement PLOT on a real quadrotor and provide open-source software, thus, showcasing one of the first successful applications of online control methods on real hardware.

75.8SYMar 17
Data-driven generalized perimeter control: Zürich case study

Alessio Rimoldi, Carlo Cenedese, Alberto Padoan et al.

Urban traffic congestion is a key challenge for the development of modern cities, requiring advanced control techniques to optimize existing infrastructures usage. Despite the extensive availability of data, modeling such complex systems remains an expensive and time consuming step when designing model-based control approaches. On the other hand, machine learning approaches require simulations to bootstrap models, or are unable to deal with the sparse nature of traffic data and enforce hard constraints. We propose a novel formulation of traffic dynamics based on behavioral systems theory and apply data-enabled predictive control to steer traffic dynamics via dynamic traffic light control. A high-fidelity simulation of the city of Zürich, the largest closed-loop microscopic simulation of urban traffic in the literature to the best of our knowledge, is used to validate the performance of the proposed method in terms of total travel time and CO2 emissions.

92.1SYApr 27
The Fragility of Learning LQG Controllers

Bruce D. Lee, Anastasios Tsiamis, Nikolai Matni et al.

Learning methods are increasingly used to synthesize controllers from data, yet existing sample-complexity characterizations for continuous control are sharp only in the fully observed setting. This paper studies the partially observed case by deriving information-theoretic lower bounds for learning Linear Quadratic Gaussian (LQG) controllers from offline trajectories generated by a (linear) exploration policy. We prove an $\varepsilon$-local minimax excess-cost lower bound that applies to any algorithm mapping the offline dataset to a stabilizing linear controller. The bound is expressed in terms of the Hessian of the LQG cost with respect to model parameters and the inverse Fisher Information induced by the exploration policy. We further provide system-theoretic characterizations of these objects, enabling transparent construction of hard instances. Instantiating the bound on classical fragile robust-control examples, including variants of the Doyle LQG fragility counterexample and non-minimum-phase systems, demonstrates when fragile robust control problems translate into high sample complexity for learning-enabled control. These results suggest the asymptotic optimality of certainty-equivalent synthesis and motivate the importance of both task-directed experiment design and system co-design for sample-efficient learning in partially observed control.

SYApr 18, 2024
MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models

Jiaqi Yan, Ankush Chakrabarty, Alisa Rupenyan et al.

In this paper, we consider the problem of reference tracking in uncertain nonlinear systems. A neural State-Space Model (NSSM) is used to approximate the nonlinear system, where a deep encoder network learns the nonlinearity from data, and a state-space component captures the temporal relationship. This transforms the nonlinear system into a linear system in a latent space, enabling the application of model predictive control (MPC) to determine effective control actions. Our objective is to design the optimal controller using limited data from the \textit{target system} (the system of interest). To this end, we employ an implicit model-agnostic meta-learning (iMAML) framework that leverages information from \textit{source systems} (systems that share similarities with the target system) to expedite training in the target system and enhance its control performance. The framework consists of two phases: the (offine) meta-training phase learns a aggregated NSSM using data from source systems, and the (online) meta-inference phase quickly adapts this aggregated model to the target system using only a few data points and few online training iterations, based on local loss function gradients. The iMAML algorithm exploits the implicit function theorem to exactly compute the gradient during training, without relying on the entire optimization path. By focusing solely on the optimal solution, rather than the path, we can meta-train with less storage complexity and fewer approximations than other contemporary meta-learning algorithms. We demonstrate through numerical examples that our proposed method can yield accurate predictive models by adaptation, resulting in a downstream MPC that outperforms several baselines.

LGAug 14, 2025
Pinet: Optimizing hard-constrained neural networks with orthogonal projection layers

Panagiotis D. Grontas, Antonio Terpin, Efe C. Balta et al.

We introduce an output layer for neural networks that ensures satisfaction of convex constraints. Our approach, $Π$net, leverages operator splitting for rapid and reliable projections in the forward pass, and the implicit function theorem for backpropagation. We deploy $Π$net as a feasible-by-design optimization proxy for parametric constrained optimization problems and obtain modest-accuracy solutions faster than traditional solvers when solving a single problem, and significantly faster for a batch of problems. We surpass state-of-the-art learning approaches in terms of training time, solution quality, and robustness to hyperparameter tuning, while maintaining similar inference times. Finally, we tackle multi-vehicle motion planning with non-convex trajectory preferences and provide $Π$net as a GPU-ready package implemented in JAX with effective tuning heuristics.

SYApr 1, 2024
Finite Sample Frequency Domain Identification

Anastasios Tsiamis, Mohamed Abdalmoaty, Roy S. Smith et al.

We study non-parametric frequency-domain system identification from a finite-sample perspective. We assume an open loop scenario where the excitation input is periodic and consider the Empirical Transfer Function Estimate (ETFE), where the goal is to estimate the frequency response at certain desired (evenly-spaced) frequencies, given input-output samples. We show that under sub-Gaussian colored noise (in time-domain) and stability assumptions, the ETFE estimates are concentrated around the true values. The error rate is of the order of $\mathcal{O}((d_{\mathrm{u}}+\sqrt{d_{\mathrm{u}}d_{\mathrm{y}}})\sqrt{M/N_{\mathrm{tot}}})$, where $N_{\mathrm{tot}}$ is the total number of samples, $M$ is the number of desired frequencies, and $d_{\mathrm{u}},\,d_{\mathrm{y}}$ are the dimensions of the input and output signals respectively. This rate remains valid for general irrational transfer functions and does not require a finite order state-space representation. By tuning $M$, we obtain a $N_{\mathrm{tot}}^{-1/3}$ finite-sample rate for learning the frequency response over all frequencies in the $ \mathcal{H}_{\infty}$ norm. Our result draws upon an extension of the Hanson-Wright inequality to semi-infinite matrices. We study the finite-sample behavior of ETFE in simulations.

63.1OCApr 10
A Unified Control-Theoretic Framework for Saddle-Point Dynamics in Constrained Optimization

Veronica Centorrino, Rawan Hoteit, Efe C. Balta et al.

This paper studies equality-constrained minimization problems through the lens of feedback control. We introduce a unified control-theoretic framework by showing that a PID feedback law acting on the dual variable induces the PID saddle-point flow (PID-SPF), a broad class of saddle-point dynamics associated with the augmented Lagrangian. This framework recovers several classical primal-dual flows as special cases. We prove that the equilibria of the proposed flow coincide with the stationary points of the original problem. Our analysis reveals how the feedback gains affect the optimization: integral action enforces constraint satisfaction, proportional action introduces the augmented Lagrangian structure, and derivative action modifies the geometry of the primal dynamics by inducing a state-dependent Riemannian metric. Moreover, for convex problems with affine constraints, we establish global exponential convergence by leveraging contraction theory for all admissible PID gains, providing in the process explicit bounds on the convergence rate. Finally, we validate our theoretical results on numerical examples including an application to bilevel optimization.

LGMar 26, 2025
Wasserstein Distributionally Robust Bayesian Optimization with Continuous Context

Francesco Micheli, Efe C. Balta, Anastasios Tsiamis et al.

We address the challenge of sequential data-driven decision-making under context distributional uncertainty. This problem arises in numerous real-world scenarios where the learner optimizes black-box objective functions in the presence of uncontrollable contextual variables. We consider the setting where the context distribution is uncertain but known to lie within an ambiguity set defined as a ball in the Wasserstein distance. We propose a novel algorithm for Wasserstein Distributionally Robust Bayesian Optimization that can handle continuous context distributions while maintaining computational tractability. Our theoretical analysis combines recent results in self-normalized concentration in Hilbert spaces and finite-sample bounds for distributionally robust optimization to establish sublinear regret bounds that match state-of-the-art results. Through extensive comparisons with existing approaches on both synthetic and real-world problems, we demonstrate the simplicity, effectiveness, and practical applicability of our proposed method.

OCOct 18, 2024
Contractivity and linear convergence in bilinear saddle-point problems: An operator-theoretic approach

Colin Dirren, Mattia Bianchi, Panagiotis D. Grontas et al.

We study the convex-concave bilinear saddle-point problem $\min_x \max_y f(x) + y^\top Ax - g(y)$, where both, only one, or none of the functions $f$ and $g$ are strongly convex, and suitable rank conditions on the matrix $A$ hold. The solution of this problem is at the core of many machine learning tasks. By employing tools from monotone operator theory, we systematically prove the contractivity (in turn, the linear convergence) of several first-order primal-dual algorithms, including the Chambolle-Pock method. Our approach results in concise proofs, and it yields new convergence guarantees and tighter bounds compared to known results.