John Lygeros

h-index76

58papers

1,816citations

Novelty46%

AI Score54

Ranked #11,375 of 194,257 authors (top 6%)#40 in SY (top 2%)

58 Papers

1.2SYMay 8, 2017

Aggregation and Disaggregation of Energetic Flexibility from Distributed Energy Resources

Fabian L. Müller, Jácint Szabó, Olle Sundström et al.

A variety of energy resources has been identified as being flexible in their electric energy consumption or generation. This energetic flexibility can be used for various purposes such as minimizing energy procurement costs or providing ancillary services to power grids. To fully leverage the flexibility available from distributed small-scale resources, their flexibility must be quantified and aggregated. This paper introduces a generic and scalable approach for flexible energy systems to quantitatively describe and price their flexibility based on zonotopic sets. The description proposed allows aggregators to efficiently pool the flexibility of large numbers of systems and to make control and market decisions on the aggregate level. In addition, an algorithm is presented that distributes aggregate-level control decisions among the individual systems of the pool in an economically fair and computationally efficient way. Finally, it is shown how the zonotopic description of flexibility enables an efficient computation of aggregate regulation power bid-curves.

17.6OCFeb 15, 2013

Symbolic control of stochastic systems via approximately bisimilar finite abstractions

Majid Zamani, Peyman Mohajerin Esfahani, Rupak Majumdar et al.

Symbolic approaches to the control design over complex systems employ the construction of finite-state models that are related to the original control systems, then use techniques from finite-state synthesis to compute controllers satisfying specifications given in a temporal logic, and finally translate the synthesized schemes back as controllers for the concrete complex systems. Such approaches have been successfully developed and implemented for the synthesis of controllers over non-probabilistic control systems. In this paper, we extend the technique to probabilistic control systems modeled by controlled stochastic differential equations. We show that for every stochastic control system satisfying a probabilistic variant of incremental input-to-state stability, and for every given precision $\varepsilon>0$, a finite-state transition system can be constructed, which is $\varepsilon$-approximately bisimilar (in the sense of moments) to the original stochastic control system. Moreover, we provide results relating stochastic control systems to their corresponding finite-state transition systems in terms of probabilistic bisimulation relations known in the literature. We demonstrate the effectiveness of the construction by synthesizing controllers for stochastic control systems over rich specifications expressed in linear temporal logic. The discussed technique enables a new, automated, correct-by-construction controller synthesis approach for stochastic control systems, which are common mathematical models employed in many safety critical systems subject to structured uncertainty and are thus relevant for cyber-physical applications.

3.3OCJun 7, 2016

Robust Optimal Control with Adjustable Uncertainty Sets

Xiaojing Zhang, Maryam Kamgarpour, Angelos Georghiou et al.

In this paper, we develop a unified framework for studying constrained robust optimal control problems with adjustable uncertainty sets. In contrast to standard constrained robust optimal control problems with known uncertainty sets, we treat the uncertainty sets in our problems as additional decision variables. In particular, given a finite prediction horizon and a metric for adjusting the uncertainty sets, we address the question of determining the optimal size and shape of the uncertainty sets, while simultaneously ensuring the existence of a control policy that will keep the system within its constraints for all possible disturbance realizations inside the adjusted uncertainty set. Since our problem subsumes the classical constrained robust optimal control design problem, it is computationally intractable in general. We demonstrate that by restricting the families of admissible uncertainty sets and control policies, the problem can be formulated as a tractable convex optimization problem. We show that our framework captures several families of (convex) uncertainty sets of practical interest, and illustrate our approach on a demand response problem of providing control reserves for a power system.

4.6OCDec 6, 2012

Distributed Model Predictive Consensus via the Alternating Direction Method of Multipliers

Tyler H. Summers, John Lygeros

We propose a distributed optimization method for solving a distributed model predictive consensus problem. The goal is to design a distributed controller for a network of dynamical systems to optimize a coupled objective function while respecting state and input constraints. The distributed optimization method is an augmented Lagrangian method called the Alternating Direction Method of Multipliers (ADMM), which was introduced in the 1970s but has seen a recent resurgence in the context of dramatic increases in computing power and the development of widely available distributed computing platforms. The method is applied to position and velocity consensus in a network of double integrators. We find that a few tens of ADMM iterations yield closed-loop performance near what is achieved by solving the optimization problem centrally. Furthermore, the use of recent code generation techniques for solving local subproblems yields fast overall computation times.

5.2OCJul 19, 2016

The Power of Diversity: Data-Driven Robust Predictive Control for Energy Efficient Buildings and Districts

Georgios Darivianakis, Angelos Georghiou, Roy S. Smith et al.

The cooperative energy management of aggregated buildings has recently received a great deal of interest due to substantial potential energy savings. These gains are mainly obtained in two ways: (i) Exploiting the load shifting capabilities of the cooperative buildings; (ii) Utilizing the expensive but energy efficient equipment that is commonly shared by the building community (e.g., heat pumps, batteries and photovoltaics). Several deterministic and stochastic control schemes that strive to realize these savings, have been proposed in the literature. A common difficulty with all these methods is integrating knowledge about the disturbances affecting the system. In this context, the underlying disturbance distributions are often poorly characterized based on historical data. In this paper, we address this issue by exploiting the historical data to construct families of distributions which contain these underlying distributions with high confidence. We then employ tools from data-driven robust optimization to formulate a multistage stochastic optimization problem which can be approximated by a finite-dimensional linear program. The proposed method is suitable for tackling large scale systems since its complexity grows polynomially with respect to the system variables. We demonstrate its efficacy in a numerical study, in which it is shown to outperform, in terms of energy cost savings and constraint violations, established solution techniques from the literature. We conclude this study by showing the significant energy gains that are obtained by cooperatively managing a collection of buildings with heterogeneous characteristics.

2.6OCFeb 20, 2017

From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming

Peyman Mohajerin Esfahani, Tobias Sutter, Daniel Kuhn et al.

We consider linear programming (LP) problems in infinite dimensional spaces that are in general computationally intractable. Under suitable assumptions, we develop an approximation bridge from the infinite-dimensional LP to tractable finite convex programs in which the performance of the approximation is quantified explicitly. To this end, we adopt the recent developments in two areas of randomized optimization and first order methods, leading to a priori as well as a posterior performance guarantees. We illustrate the generality and implications of our theoretical results in the special case of the long-run average cost and discounted cost optimal control problems for Markov decision processes on Borel spaces. The applicability of the theoretical results is demonstrated through a constrained linear quadratic optimal control problem and a fisheries management problem.

4.6OCDec 6, 2012

Approximate Dynamic Programming via Sum of Squares Programming

Tyler H. Summers, Konstantin Kunz, Nikolaos Kariotoglou et al.

We describe an approximate dynamic programming method for stochastic control problems on infinite state and input spaces. The optimal value function is approximated by a linear combination of basis functions with coefficients as decision variables. By relaxing the Bellman equation to an inequality, one obtains a linear program in the basis coefficients with an infinite set of constraints. We show that a recently introduced method, which obtains convex quadratic value function approximations, can be extended to higher order polynomial approximations via sum of squares programming techniques. An approximate value function can then be computed offline by solving a semidefinite program, without having to sample the infinite constraint. The policy is evaluated online by solving a polynomial optimization problem, which also turns out to be convex in some cases. We experimentally validate the method on an autonomous helicopter testbed using a 10-dimensional helicopter model.

2.4OCMar 20, 2018

Distributed Model Predictive Control for Linear Systems with Adaptive Terminal Sets

Georgios Darivianakis, Annika Eichler, John Lygeros

In this paper, we propose a distributed model predictive control (DMPC) scheme for linear time-invariant constrained systems which admit a separable structure. To exploit the merits of distributed computation algorithms, the stabilizing terminal controller, value function and invariant terminal set of the DMPC optimization problem need to respect the loosely coupled structure of the system. Although existing methods in the literature address this task, they typically decouple the synthesis of terminal controllers and value functions from the one of terminal sets. In addition, these approaches do not explicitly consider the effect of the current state of the system in the synthesis process. These limitations can lead the resulting DMPC scheme to poor performance since it may admit small or even empty terminal sets. Unlike other approaches, this paper presents a unified framework to encapsulate the synthesis of both the stabilizing terminal controller and invariant terminal set into the DMPC formulation. Conditions for Lyapunov stability and invariance are imposed in the synthesis problem in a way that allows the value function and invariant terminal set to admit the desired distributed structure. We illustrate the effectiveness of the proposed method on several examples including a benchmark spring-mass-damper problem.

1.2SYJun 20, 2018

Unlocking the Potential of Flexible Energy Resources to Help Balance the Power Grid

Fabian L. Müller, Stefan Woerner, John Lygeros

Flexible energy resources can help balance the power grid by providing different types of ancillary services. However, the balancing potential of most types of resources is restricted by physical constraints such as the size of their energy buffer, limits on power-ramp rates, or control delays. Using the example of Secondary Frequency Regulation, this paper shows how the flexibility of various resources can be exploited more efficiently by considering multiple resources with complementary physical properties and controlling them in a coordinated way. To this end, optimal adjustable control policies are computed based on robust optimization. Our problem formulation takes into account power ramp-rate constraints explicitly, and accurately models the different timescales and lead times of the energy and reserve markets. Simulations demonstrate that aggregations of select resources can offer significantly more regulation capacity than the resources could provide individually.

1.2SYOct 17, 2017

A distributed algorithm for average aggregative games with coupling constraints

Francesca Parise, Basilio Gentile, John Lygeros

We consider the framework of average aggregative games, where the cost function of each agent depends on his own strategy and on the average population strategy. We focus on the case in which the agents are coupled not only via their cost functions, but also via constraints coupling their strategies. We propose a distributed algorithm that achieves an almost Nash equilibrium by requiring only local communications of the agents, as specified by a sparse communication network. The proof of convergence of the algorithm relies on the auxiliary class of network aggregative games and exploits a novel result of parametric convergence of variational inequalities, which is applicable beyond the context of games. We apply our theoretical findings to a multi-market Cournot game with transportation costs and maximum market capacity.

1.8LGMay 24, 2022

Advanced Manufacturing Configuration by Sample-efficient Batch Bayesian Optimization

Xavier Guidetti, Alisa Rupenyan, Lutz Fassl et al.

We propose a framework for the configuration and operation of expensive-to-evaluate advanced manufacturing methods, based on Bayesian optimization. The framework unifies a tailored acquisition function, a parallel acquisition procedure, and the integration of process information providing context to the optimization procedure. \cmtb{The novel acquisition function is demonstrated, analyzed and compared on state-of-the-art benchmarking problems. We apply the optimization approach to atmospheric plasma spraying and fused deposition modeling.} Our results demonstrate that the proposed framework can efficiently find input parameters that produce the desired outcome and minimize the process cost.

3.3OCOct 9, 2016

Strong Stationarity Conditions for Optimal Control of Hybrid Systems

Andreas B. Hempel, Paul Goulart, John Lygeros

We present necessary and sufficient optimality conditions for finite time optimal control problems for a class of hybrid systems described by linear complementarity models. Although these optimal control problems are difficult in general due to the presence of complementarity constraints, we provide a set of structural assumptions ensuring that the tangent cone of the constraints possesses geometric regularity properties. These imply that the classical Karush-Kuhn-Tucker conditions of nonlinear programming theory are both necessary and sufficient for local optimality, which is not the case for general mathematical programs with complementarity constraints. We also present sufficient conditions for global optimality. We proceed to show that the dynamics of every continuous piecewise affine system can be written as the optimizer of a mathematical program which results in a linear complementarity model satisfying our structural assumptions. Hence, our stationarity results apply to a large class of hybrid systems with piecewise affine dynamics. We present simulation results showing the substantial benefits possible from using a nonlinear programming approach to the optimal control problem with complementarity constraints instead of a more traditional mixed-integer formulation.

3.3SYNov 14, 2022

Implications of Regret on Stability of Linear Dynamical Systems

Aren Karapetyan, Anastasios Tsiamis, Efe C. Balta et al.

The setting of an agent making decisions under uncertainty and under dynamic constraints is common for the fields of optimal control, reinforcement learning, and recently also for online learning. In the online learning setting, the quality of an agent's decision is often quantified by the concept of regret, comparing the performance of the chosen decisions to the best possible ones in hindsight. While regret is a useful performance measure, when dynamical systems are concerned, it is important to also assess the stability of the closed-loop system for a chosen policy. In this work, we show that for linear state feedback policies and linear systems subject to adversarial disturbances, linear regret implies asymptotic stability in both time-varying and time-invariant settings. Conversely, we also show that bounded input bounded state stability and summability of the state transition matrices imply linear regret.

2.3SYApr 24, 2023

Stochastic MPC for energy hubs using data driven demand forecasting

Varsha Behrunani, Francesco Micheli, Jonas Mehr et al.

Energy hubs convert and distribute energy resources by combining different energy inputs through multiple conversion and storage components. The optimal operation of the energy hub exploits its flexibility to increase the energy efficiency and reduce the operational costs. However, uncertainties in the demand present challenges to energy hub optimization. In this paper, we propose a stochastic MPC controller to minimize energy costs using chance constraints for the uncertain electricity and thermal demands. Historical data is used to build a demand prediction model based on Gaussian processes to generate a forecast of the future electricity and heat demands. The stochastic optimization problem is solved via the Scenario Approach by sampling multi-step demand trajectories from the derived prediction model. The performance of the proposed predictor and of the stochastic controller is verified on a simulated energy hub model and demand data from a real building.

2.4OCJun 27, 2021

Decentralized decision making for networks of uncertain systems

Georgios Darivianakis, Angelos Georghiou, John Lygeros

Distributed model predictive control (MPC) has been proven a successful method in regulating the operation of large-scale networks of constrained dynamical systems. This paper is concerned with cooperative distributed MPC in which the decision actions of the systems are usually derived by the solution of a system-wide optimization problem. However, formulating and solving such large-scale optimization problems is often a hard task which requires extensive information communication among the individual systems and fails to address privacy concerns in the network. Hence, the main challenge is to design decision policies with a prescribed structure so that the resulting system-wide optimization problem to admit a loosely coupled structure and be amendable to distributed computation algorithms. In this paper, we propose a decentralized problem synthesis scheme which only requires each system to communicate sets which bound its states evolution to neighboring systems. The proposed method alleviates concerns on privacy since this limited communication scheme does not reveal the exact characteristics of the dynamics within each system. In addition, it enables a distributed computation of the solution, making our method highly scalable. We demonstrate in a number of numerical studies, inspired by engineering and finance, the efficacy of the proposed approach which leads to solutions that closely approximate those obtained by the centralized formulation only at a fraction of the computational effort.

6.1OCApr 24, 2023

Designing Optimal Personalized Incentive for Traffic Routing using BIG Hype algorithm

Panagiotis D. Grontas, Carlo Cenedese, Marta Fochesato et al.

We study the problem of optimally routing plug-in electric and conventional fuel vehicles on a city level. In our model, commuters selfishly aim to minimize a local cost that combines travel time, from a fixed origin to a desired destination, and the monetary cost of using city facilities, parking or service stations. The traffic authority can influence the commuters' preferred routing choice by means of personalized discounts on parking tickets and on the energy price at service stations. We formalize the problem of designing these monetary incentives optimally as a large-scale bilevel game, where constraints arise at both levels due to the finite capacities of city facilities and incentives budget. Then, we develop an efficient decentralized solution scheme with convergence guarantees based on BIG Hype, a recently-proposed hypergradient-based algorithm for hierarchical games. Finally, we validate our model via numerical simulations over the Anaheim's network, and show that the proposed approach produces sensible results in terms of traffic decongestion and it is able to solve in minutes problems with more than 48000 variables and 110000 constraints.

1.2SYNov 18, 2022

Optimal service station design for traffic mitigation via genetic algorithm and neural network

Carlo Cenedese, Michele Cucuzzella, Adriano Cotta Ramusino et al.

This paper analyzes how the presence of service stations on highways affects traffic congestion. We focus on the problem of optimally designing a service station to achieve beneficial effects in terms of total traffic congestion and peak traffic reduction. Microsimulators cannot be used for this task due to their computational inefficiency. We propose a genetic algorithm based on the recently proposed CTMs, that efficiently describes the dynamics of a service station. Then, we leverage the algorithm to train a neural network capable of solving the same problem, avoiding implementing the CTMs. Finally, we examine two case studies to validate the capabilities and performance of our algorithms. In these simulations, we use real data extracted from Dutch highways.

1.2SYNov 14, 2022Code

Follow the Clairvoyant: an Imitation Learning Approach to Optimal Control

Andrea Martin, Luca Furieri, Florian Dörfler et al.

We consider control of dynamical systems through the lens of competitive analysis. Most prior work in this area focuses on minimizing regret, that is, the loss relative to an ideal clairvoyant policy that has noncausal access to past, present, and future disturbances. Motivated by the observation that the optimal cost only provides coarse information about the ideal closed-loop behavior, we instead propose directly minimizing the tracking error relative to the optimal trajectories in hindsight, i.e., imitating the clairvoyant policy. By embracing a system level perspective, we present an efficient optimization-based approach for computing follow-the-clairvoyant (FTC) safe controllers. We prove that these attain minimal regret if no constraints are imposed on the noncausal benchmark. In addition, we present numerical experiments to show that our policy retains the hallmark of competitive algorithms of interpolating between classical $\mathcal{H}_2$ and $\mathcal{H}_\infty$ control laws - while consistently outperforming regret minimization methods in constrained scenarios thanks to the superior ability to chase the clairvoyant.

2.3SYApr 10, 2022

Regret Analysis of Online Gradient Descent-based Iterative Learning Control with Model Mismatch

Efe C. Balta, Andrea Iannelli, Roy S. Smith et al.

In Iterative Learning Control (ILC), a sequence of feedforward control actions is generated at each iteration on the basis of partial model knowledge and past measurements with the goal of steering the system toward a desired reference trajectory. This is framed here as an online learning task, where the decision-maker takes sequential decisions by solving a sequence of optimization problems having only partial knowledge of the cost functions. Having established this connection, the performance of an online gradient-descent based scheme using inexact gradient information is analyzed in the setting of dynamic and static regret, standard measures in online learning. Fundamental limitations of the scheme and its integration with adaptation mechanisms are further investigated, followed by numerical simulations on a benchmark ILC problem.

1.2SYFeb 18, 2019

Nonlinear Control of Quadcopters via Approximate Dynamic Programming

Angel Romero, Paul N. Beuchat, Yvonne R. Stürz et al.

While Approximate Dynamic Programming has successfully been used in many applications involving discrete states and inputs such as playing the games of Tetris or chess, it has not been used in many continuous state and input space applications. In this paper, we combine Approximate Dynamic Programming techniques and apply them to the continuous, non-linear and high dimensional dynamics of a quadcopter vehicle. We use a polynomial approximation of the dynamics and sum-of-squares programming techniques to compute a family of polynomial value function approximations for different tuning parameters. The resulting approximations to the optimal value function are combined in a point-wise maximum approach, which is used to compute the online policy. The success of the method is demonstrated in both simulations and experiments on a quadcopter. The control performance is compared to a linear time-varying Model Predictive Controller. The two methods are then combined to keep the computational benefits of a short horizon MPC and the long term performance benefits of the Approximate Dynamic Programming value function as the terminal cost.

1.2SYAug 30, 2018

Performance guarantees for model-based Approximate Dynamic Programming in continuous spaces

Paul N. Beuchat, Angelos Georghiou, John Lygeros

We study both the value function and Q-function formulation of the Linear Programming approach to Approximate Dynamic Programming. The approach is model-based and optimizes over a restricted function space to approximate the value function or Q-function. Working in the discrete time, continuous space setting, we provide guarantees for the fitting error and online performance of the policy. In particular, the online performance guarantee is obtained by analyzing an iterated version of the greedy policy, and the fitting error guarantee by analyzing an iterated version of the Bellman inequality. These guarantees complement the existing bounds that appear in the literature. The Q-function formulation offers benefits, for example, in decentralized controller design, however it can lead to computationally demanding optimization problems. To alleviate this drawback, we provide a condition that simplifies the formulation, resulting in improved computational times.

1.2SYApr 11, 2018

Ramp-Rate-Constrained Bidding of Energy and Frequency Reserves in Real Market Settings

Fabian L. Müller, Stefan Woerner, John Lygeros

The energetic flexibility of electric energy resources can be exploited when trading on wholesale energy and ancillary service markets. This paper considers the problem of a Balance Responsible Party to maximize its profit from trading on energy markets while simultaneously offering Secondary Frequency Reserves to the System Operator. However, the accurate provision of regulation power can be compromised by power ramp-rate limitations of the resources providing the service. To avoid this shortcoming, we take power ramp-rate constraints into account explicitly when computing optimal adjustable energy trading policies that are robust against uncertain activation of reserves. The approach proposed is applicable in real market settings because it models all the different timescales of the day-ahead, intra-day, and reserve markets. Finally, the effect of different market settings and energy trading policies on the amount of available Secondary Frequency Reserves is investigated.

2.4OCOct 6, 2017

The Linear Programming Approach to Reach-Avoid Problems for Markov Decision Processes

Nikolaos Kariotoglou, Maryam Kamgarpour, Tyler Summers et al.

One of the most fundamental problems in Markov decision processes is analysis and control synthesis for safety and reachability specifications. We consider the stochastic reach-avoid problem, in which the objective is to synthesize a control policy to maximize the probability of reaching a target set at a given time, while staying in a safe set at all prior times. We characterize the solution to this problem through an infinite dimensional linear program. We then develop a tractable approximation to the infinite dimensional linear program through finite dimensional approximations of the decision space and constraints. For a large class of Markov decision processes modeled by Gaussian mixtures kernels we show that through a proper selection of the finite dimensional space, one can further reduce the computational complexity of the resulting linear program. We validate the proposed method and analyze its potential with a series of numerical case studies.

1.2SYMar 30, 2016

Distributed Learning in the Presence of Disturbances

Chithrupa Ramesh, Marius Schmitt, John Lygeros

We consider a problem where multiple agents must learn an action profile that maximises the sum of their utilities in a distributed manner. The agents are assumed to have no knowledge of either the utility functions or the actions and payoffs of other agents. These assumptions arise when modelling the interactions in a complex system and communicating between various components of the system are both difficult. In [1], a distributed algorithm was proposed, which learnt Pareto-efficient solutions in this problem setting. However, the approach assumes that all agents can choose their actions, which precludes disturbances. In this paper, we show that a modified version of this distributed learning algorithm can learn Pareto-efficient solutions, even in the presence of disturbances from a finite set. We apply our approach to the problem of ramp coordination in traffic control for different demand profiles.

2.5OCApr 20, 2011

On mean-square boundedness of stochastic linear systems with quantized observations

Debasish Chatterjee, Peter Hokayem, Federico Ramponi et al.

We propose a procedure to design a state-quantizer with finitely many bins for a marginally stable stochastic linear system evolving in $\R^d$, and a bounded policy based on the resulting quantized state measurements to ensure bounded second moment in closed-loop.

5.2AIMay 21

Deep Reinforcement Learning for Flexible Job Shop Scheduling with Random Job Arrivals

Yu Tang, Muhammad Zakwan, Efe Balta et al.

The Flexible Job Shop Scheduling Problem (FJSP) is the optimal allocation of a set of jobs to machines. Two primary challenges persist in FJSP: the unpredictable arrival of future jobs and the combinatorial complexity of the problem, rendering it intractable for conventional mixed-integer linear programming solvers. This paper proposes an event-based \gls{DRL} approach to solve FJSP with random job arrivals. Specifically, we employ the Proximal Policy Optimization algorithm and use lightweight Multi-Layer Perceptrons to train the \gls{DRL} agent for minimizing the total completion time of all jobs. We design the state representation to be directly accessible from the environment, and limit the learning agent to selecting from among a set of well-established dispatching rules. Simulations show that our \gls{DRL} approach outperforms any of the individual dispatching rules on datasets with varying heterogeneity and job arrival rates. We benchmark our \gls{DRL} against an arrival-triggered mixed-integer linear programming solution and show that our method achieves good performance especially when the datasets are heterogeneous.

7.2SYMar 26

Policy Optimization with Differentiable MPC: Convergence Analysis under Uncertainty

Riccardo Zuliani, Efe C. Balta, John Lygeros

Model-based policy optimization is a well-established framework for designing reliable and high-performance controllers across a wide range of control applications. Recently, this approach has been extended to model predictive control policies, where explicit dynamical models are embedded within the control law. However, the performance of the resulting controllers, and the convergence of the associated optimization algorithms, critically depends on the accuracy of the models. In this paper, we demonstrate that combining gradient-based policy optimization with recursive system identification ensures convergence to an optimal controller design and showcase our finding in several control examples.

6.2SYApr 20

Scenario-Based Stochastic MPC for Energy Hubs with EV Fleets Under Persistent Grid Outages

Kobena Badu Enyam, Cara Koepele, Timothy Asare et al.

Emissions reduction and resilience to outages motivate the adoption of renewable microgrids. Surprisingly, research integrating both probabilistic grid outages and electric vehicle (EV) charging requirements remains limited. This paper addresses this gap by developing a scenario-based stochastic model predictive controller (SMPC) for a microgrid energy hub comprising solar generation, battery storage, diesel backup, and an EV fleet connected to a weak grid. Grid outage and campus load scenarios are generated from a continuous-time Markov chain and a Gaussian Process, respectively. Using 2023 operational data from the Ashesi University Energy Hub in Ghana, we demonstrate that the SMPC achieves performance within 1\% of a perfect-forecast benchmark. In contrast, a naive MPC that assumes continuous grid availability offers no economic or sustainability advantage over rule-based control, with both incurring significantly higher costs and emissions than the SMPC. These results highlight that outage anticipation is essential for economic viability. Finally, we show that incorporating a deterministic buffer against EV consumption uncertainty eliminates over 90\% of state-of-charge violations with negligible impact on total operating costs

6.8SYApr 20

Policy Optimization for Unknown Systems using Differentiable Model Predictive Control

Riccardo Zuliani, Efe C. Balta, John Lygeros

Model-based policy optimization often struggles with inaccurate system dynamics models, leading to suboptimal closed-loop performance. This challenge is especially evident in Model Predictive Control (MPC) policies, which rely on the model for real-time trajectory planning and optimization. We introduce a novel policy optimization framework for MPC-based policies combining differentiable optimization with zeroth-order optimization. Our method combines model-based and model-free gradient estimation approaches, achieving faster transient performance compared to fully data-driven approaches while maintaining convergence guarantees, even under model uncertainty. We demonstrate the effectiveness of the proposed approach on a nonlinear control task involving a 12-dimensional quadcopter model.

1.2SYMay 1, 2017

Computing the projected reachable set of switched affine systems: an application to systems biology

Francesca Parise, Maria Elena Valcher, John Lygeros

A fundamental question in systems biology is what combinations of mean and variance of the species present in a stochastic biochemical reaction network are attainable by perturbing the system with an external signal. To address this question, we show that the moments evolution in any generic network can be either approximated or, under suitable assumptions, computed exactly as the solution of a switched affine system. Motivated by this application, we propose a new method to approximate the reachable set of switched affine systems. A remarkable feature of our approach is that it allows one to easily compute projections of the reachable set for pairs of moments of interest, without requiring the computation of the full reachable set, which can be prohibitive for large networks. As a second contribution, we also show how to select the external signal in order to maximize the probability of reaching a target set. To illustrate the method we study a renown model of controlled gene expression and we derive estimates of the reachable set, for the protein mean and variance, that are more accurate than those available in the literature and consistent with experimental data.

10.4LGSep 26, 2024

Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel

Jialin Li, Marta Zagorowska, Giulia De Pasquale et al.

Ensuring safety is a key aspect in sequential decision making problems, such as robotics or process control. The complexity of the underlying systems often makes finding the optimal decision challenging, especially when the safety-critical system is time-varying. Overcoming the problem of optimizing an unknown time-varying reward subject to unknown time-varying safety constraints, we propose TVSafeOpt, a new algorithm built on Bayesian optimization with a spatio-temporal kernel. The algorithm is capable of safely tracking a time-varying safe region without the need for explicit change detection. Optimality guarantees are also provided for the algorithm when the optimization problem becomes stationary. We show that TVSafeOpt compares favorably against SafeOpt on synthetic data, both regarding safety and optimality. Evaluation on a realistic case study with gas compressors confirms that TVSafeOpt ensures safety when solving time-varying optimization problems with unknown reward and safety functions.

1.2SYSep 6, 2024

Online Residual Learning from Offline Experts for Pedestrian Tracking

Anastasios Vlachos, Anastasios Tsiamis, Aren Karapetyan et al.

In this paper, we consider the problem of predicting unknown targets from data. We propose Online Residual Learning (ORL), a method that combines online adaptation with offline-trained predictions. At a lower level, we employ multiple offline predictions generated before or at the beginning of the prediction horizon. We augment every offline prediction by learning their respective residual error concerning the true target state online, using the recursive least squares algorithm. At a higher level, we treat the augmented lower-level predictors as experts, adopting the Prediction with Expert Advice framework. We utilize an adaptive softmax weighting scheme to form an aggregate prediction and provide guarantees for ORL in terms of regret. We employ ORL to boost performance in the setting of online pedestrian trajectory prediction. Based on data from the Stanford Drone Dataset, we show that ORL can demonstrate best-of-both-worlds performance.

5.1SYFeb 15, 2024Code

Predictive Linear Online Tracking for Unknown Targets

Anastasios Tsiamis, Aren Karapetyan, Yueshan Li et al.

In this paper, we study the problem of online tracking in linear control systems, where the objective is to follow a moving target. Unlike classical tracking control, the target is unknown, non-stationary, and its state is revealed sequentially, thus, fitting the framework of online non-stochastic control. We consider the case of quadratic costs and propose a new algorithm, called predictive linear online tracking (PLOT). The algorithm uses recursive least squares with exponential forgetting to learn a time-varying dynamic model of the target. The learned model is used in the optimal policy under the framework of receding horizon control. We show the dynamic regret of PLOT scales with $\mathcal{O}(\sqrt{TV_T})$, where $V_T$ is the total variation of the target dynamics and $T$ is the time horizon. Unlike prior work, our theoretical results hold for non-stationary targets. We implement PLOT on a real quadrotor and provide open-source software, thus, showcasing one of the first successful applications of online control methods on real hardware.

5.1SYApr 18, 2024

MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models

Jiaqi Yan, Ankush Chakrabarty, Alisa Rupenyan et al.

In this paper, we consider the problem of reference tracking in uncertain nonlinear systems. A neural State-Space Model (NSSM) is used to approximate the nonlinear system, where a deep encoder network learns the nonlinearity from data, and a state-space component captures the temporal relationship. This transforms the nonlinear system into a linear system in a latent space, enabling the application of model predictive control (MPC) to determine effective control actions. Our objective is to design the optimal controller using limited data from the \textit{target system} (the system of interest). To this end, we employ an implicit model-agnostic meta-learning (iMAML) framework that leverages information from \textit{source systems} (systems that share similarities with the target system) to expedite training in the target system and enhance its control performance. The framework consists of two phases: the (offine) meta-training phase learns a aggregated NSSM using data from source systems, and the (online) meta-inference phase quickly adapts this aggregated model to the target system using only a few data points and few online training iterations, based on local loss function gradients. The iMAML algorithm exploits the implicit function theorem to exactly compute the gradient during training, without relying on the entire optimization path. By focusing solely on the optimal solution, rather than the path, we can meta-train with less storage complexity and fewer approximations than other contemporary meta-learning algorithms. We demonstrate through numerical examples that our proposed method can yield accurate predictive models by adaptation, resulting in a downstream MPC that outperforms several baselines.

2.3SYApr 1, 2024

Finite Sample Frequency Domain Identification

Anastasios Tsiamis, Mohamed Abdalmoaty, Roy S. Smith et al.

We study non-parametric frequency-domain system identification from a finite-sample perspective. We assume an open loop scenario where the excitation input is periodic and consider the Empirical Transfer Function Estimate (ETFE), where the goal is to estimate the frequency response at certain desired (evenly-spaced) frequencies, given input-output samples. We show that under sub-Gaussian colored noise (in time-domain) and stability assumptions, the ETFE estimates are concentrated around the true values. The error rate is of the order of $\mathcal{O}((d_{\mathrm{u}}+\sqrt{d_{\mathrm{u}}d_{\mathrm{y}}})\sqrt{M/N_{\mathrm{tot}}})$, where $N_{\mathrm{tot}}$ is the total number of samples, $M$ is the number of desired frequencies, and $d_{\mathrm{u}},\,d_{\mathrm{y}}$ are the dimensions of the input and output signals respectively. This rate remains valid for general irrational transfer functions and does not require a finite order state-space representation. By tuning $M$, we obtain a $N_{\mathrm{tot}}^{-1/3}$ finite-sample rate for learning the frequency response over all frequencies in the $ \mathcal{H}_{\infty}$ norm. Our result draws upon an extension of the Hanson-Wright inequality to semi-infinite matrices. We study the finite-sample behavior of ETFE in simulations.

5.6OCOct 18, 2024

Contractivity and linear convergence in bilinear saddle-point problems: An operator-theoretic approach

Colin Dirren, Mattia Bianchi, Panagiotis D. Grontas et al.

We study the convex-concave bilinear saddle-point problem $\min_x \max_y f(x) + y^\top Ax - g(y)$, where both, only one, or none of the functions $f$ and $g$ are strongly convex, and suitable rank conditions on the matrix $A$ hold. The solution of this problem is at the core of many machine learning tasks. By employing tools from monotone operator theory, we systematically prove the contractivity (in turn, the linear convergence) of several first-order primal-dual algorithms, including the Chambolle-Pock method. Our approach results in concise proofs, and it yields new convergence guarantees and tighter bounds compared to known results.

3.2OCMay 24, 2024Code

Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces

Angeliki Kamoutsi, Peter Schmitt-Förster, Tobias Sutter et al.

This work studies discrete-time discounted Markov decision processes with continuous state and action spaces and addresses the inverse problem of inferring a cost function from observed optimal behavior. We first consider the case in which we have access to the entire expert policy and characterize the set of solutions to the inverse problem by using occupation measures, linear duality, and complementary slackness conditions. To avoid trivial solutions and ill-posedness, we introduce a natural linear normalization constraint. This results in an infinite-dimensional linear feasibility problem, prompting a thorough analysis of its properties. Next, we use linear function approximators and adopt a randomized approach, namely the scenario approach and related probabilistic feasibility guarantees, to derive epsilon-optimal solutions for the inverse problem. We further discuss the sample complexity for a desired approximation accuracy. Finally, we deal with the more realistic case where we only have access to a finite set of expert demonstrations and a generative model and provide bounds on the error made when working with samples.

1.2MEMar 9, 2024

Online Identification of Stochastic Continuous-Time Wiener Models Using Sampled Data

Mohamed Abdalmoaty, Efe C. Balta, John Lygeros et al.

It is well known that ignoring the presence of stochastic disturbances in the identification of stochastic Wiener models leads to asymptotically biased estimators. On the other hand, optimal statistical identification, via likelihood-based methods, is sensitive to the assumptions on the data distribution and is usually based on relatively complex sequential Monte Carlo algorithms. We develop a simple recursive online estimation algorithm based on an output-error predictor, for the identification of continuous-time stochastic parametric Wiener models through stochastic approximation. The method is applicable to generic model parameterizations and, as demonstrated in the numerical simulation examples, it is robust with respect to the assumptions on the spectrum of the disturbance process.

16.2OCJan 25, 2024

Towards a Systems Theory of Algorithms

Florian Dörfler, Zhiyu He, Giuseppe Belgioioso et al.

Traditionally, numerical algorithms are seen as isolated pieces of code confined to an {\em in silico} existence. However, this perspective is not appropriate for many modern computational approaches in control, learning, or optimization, wherein {\em in vivo} algorithms interact with their environment. Examples of such {\em open algorithms} include various real-time optimization-based control strategies, reinforcement learning, decision-making architectures, online optimization, and many more. Further, even {\em closed} algorithms in learning or optimization are increasingly abstracted in block diagrams with interacting dynamic modules and pipelines. In this opinion paper, we state our vision on a to-be-cultivated {\em systems theory of algorithms} and argue in favor of viewing algorithms as open dynamical systems interacting with other algorithms, physical systems, humans, or databases. Remarkably, the manifold tools developed under the umbrella of systems theory are well suited for addressing a range of challenges in the algorithmic domain. We survey various instances where the principles of algorithmic systems theory are being developed and outline pertinent modeling, analysis, and design challenges.

11.8LGFeb 1, 2022

PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation

Matilde Gargiani, Andrea Zanelli, Andrea Martinelli et al.

Despite their success, policy gradient methods suffer from high variance of the gradient estimate, which can result in unsatisfactory sample complexity. Recently, numerous variance-reduced extensions of policy gradient methods with provably better sample complexity and competitive numerical performance have been proposed. After a compact survey on some of the main variance-reduced REINFORCE-type methods, we propose ProbAbilistic Gradient Estimation for Policy Gradient (PAGE-PG), a novel loopless variance-reduced policy gradient method based on a probabilistic switch between two types of updates. Our method is inspired by the PAGE estimator for supervised learning and leverages importance sampling to obtain an unbiased gradient estimator. We show that PAGE-PG enjoys a $\mathcal{O}\left( ε^{-3} \right)$ average sample complexity to reach an $ε$-stationary solution, which matches the sample complexity of its most competitive counterparts under the same setting. A numerical evaluation confirms the competitive performance of our method on classical control tasks.

1.8LGDec 31, 2021

Stochastic convex optimization for provably efficient apprenticeship learning

Angeliki Kamoutsi, Goran Banjac, John Lygeros

We consider large-scale Markov decision processes (MDPs) with an unknown cost function and employ stochastic convex optimization tools to address the problem of imitation learning, which consists of learning a policy from a finite set of expert demonstrations. We adopt the apprenticeship learning formalism, which carries the assumption that the true cost function can be represented as a linear combination of some known features. Existing inverse reinforcement learning algorithms come with strong theoretical guarantees, but are computationally expensive because they use reinforcement learning or planning algorithms as a subroutine. On the other hand, state-of-the-art policy gradient based algorithms (like IM-REINFORCE, IM-TRPO, and GAIL), achieve significant empirical success in challenging benchmark tasks, but are not well understood in terms of theory. With an emphasis on non-asymptotic guarantees of performance, we propose a method that directly learns a policy from expert demonstrations, bypassing the intermediate step of learning the cost function, by formulating the problem as a single convex optimization problem over occupancy measures. We develop a computationally efficient algorithm and derive high confidence regret bounds on the quality of the extracted policy, utilizing results from stochastic convex optimization and recent works in approximate linear programming for solving forward MDPs.

9.2LGDec 28, 2021

Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations

Angeliki Kamoutsi, Goran Banjac, John Lygeros

We consider large-scale Markov decision processes with an unknown cost function and address the problem of learning a policy from a finite set of expert demonstrations. We assume that the learner is not allowed to interact with the expert and has no access to reinforcement signal of any kind. Existing inverse reinforcement learning methods come with strong theoretical guarantees, but are computationally expensive, while state-of-the-art policy optimization algorithms achieve significant empirical success, but are hampered by limited theoretical understanding. To bridge the gap between theory and practice, we introduce a novel bilinear saddle-point framework using Lagrangian duality. The proposed primal-dual viewpoint allows us to develop a model-free provably efficient algorithm through the lens of stochastic convex optimization. The method enjoys the advantages of simplicity of implementation, low memory requirements, and computational and sample complexities independent of the number of states. We further present an equivalent no-regret online-learning interpretation.

9.2LGOct 29, 2021

Physics-informed linear regression is competitive with two Machine Learning methods in residential building MPC

Felix Bünning, Benjamin Huber, Adrian Schalbetter et al.

Because physics-based building models are difficult to obtain as each building is individual, there is an increasing interest in generating models suitable for building MPC directly from measurement data. Machine learning methods have been widely applied to this problem and validated mostly in simulation; there are, however, few studies on a direct comparison of different models or validation in real buildings to be found in the literature. Methods that are indeed validated in application often lead to computationally complex non-convex optimization problems. Here we compare physics-informed Autoregressive-Moving-Average with Exogenous Inputs (ARMAX) models to Machine Learning models based on Random Forests and Input Convex Neural Networks and the resulting convex MPC schemes in experiments on a practical building application with the goal of minimizing energy consumption while maintaining occupant comfort, and in a numerical case study. We demonstrate that Predictive Control in general leads to savings between 26% and 49% of heating and cooling energy, compared to the building's baseline hysteresis controller. Moreover, we show that all model types lead to satisfactory control performance in terms of constraint satisfaction and energy reduction. However, we also see that the physics-informed ARMAX models have a lower computational burden, and a superior sample efficiency compared to the Machine Learning based models. Moreover, even if abundant training data is available, the ARMAX models have a significantly lower prediction error than the Machine Learning models, which indicates that the encoded physics-based prior of the former cannot independently be found by the latter.

2.3SYMar 25, 2021

Plasma Spray Process Parameters Configuration using Sample-efficient Batch Bayesian Optimization

Xavier Guidetti, Alisa Rupenyan, Lutz Fassl et al.

Recent work has shown constrained Bayesian optimization to be a powerful technique for the optimization of industrial processes. In complex manufacturing processes, the possibility to run extensive sequences of experiments with the goal of finding good process parameters is severely limited by the time required for quality evaluation of the produced parts. To accelerate the process parameter optimization, we introduce a parallel acquisition procedure tailored on the process characteristics. We further propose an algorithm that adapts to equipment status to improve run-to-run reproducibility. We validate our optimization method numerically and experimentally, and demonstrate that it can efficiently find input parameters that produce the desired outcome and minimize the process cost.

8.6SYJan 19, 2021

Safe and Efficient Model-free Adaptive Control via Bayesian Optimization

Christopher König, Matteo Turchetta, John Lygeros et al.

Adaptive control approaches yield high-performance controllers when a precise system model or suitable parametrizations of the controller are available. Existing data-driven approaches for adaptive control mostly augment standard model-based methods with additional information about uncertainties in the dynamics or about disturbances. In this work, we propose a purely data-driven, model-free approach for adaptive control. Tuning low-level controllers based solely on system data raises concerns on the underlying algorithm safety and computational performance. Thus, our approach builds on GoOSE, an algorithm for safe and sample-efficient Bayesian optimization. We introduce several computational and algorithmic modifications in GoOSE that enable its practical use on a rotational motion system. We numerically demonstrate for several types of disturbances that our approach is sample efficient, outperforms constrained Bayesian optimization in terms of safety, and achieves the performance optima computed by grid evaluation. We further demonstrate the proposed adaptive control approach experimentally on a rotational motion system.

15.7RONov 26, 2020

Learning from Simulation, Racing in Reality

Eugenio Chisari, Alexander Liniger, Alisa Rupenyan et al.

We present a reinforcement learning-based solution to autonomously race on a miniature race car platform. We show that a policy that is trained purely in simulation using a relatively simple vehicle model, including model randomization, can be successfully transferred to the real robotic setup. We achieve this by using novel policy output regularization approach and a lifted action space which enables smooth actions but still aggressive race car driving. We show that this regularized policy does outperform the Soft Actor Critic (SAC) baseline method, both in simulation and on the real car, but it is still outperformed by a Model Predictive Controller (MPC) state of the art method. The refinement of the policy with three hours of real-world interaction data allows the reinforcement learning policy to achieve lap times similar to the MPC controller while reducing track constraint violations by 50%.

10.3SYNov 26, 2020

Input Convex Neural Networks for Building MPC

Felix Bünning, Adrian Schalbetter, Ahmed Aboudonia et al.

Model Predictive Control in buildings can significantly reduce their energy consumption. The cost and effort necessary for creating and maintaining first principle models for buildings make data-driven modelling an attractive alternative in this domain. In MPC the models form the basis for an optimization problem whose solution provides the control signals to be applied to the system. The fact that this optimization problem has to be solved repeatedly in real-time implies restrictions on the learning architectures that can be used. Here, we adapt Input Convex Neural Networks that are generally only convex for one-step predictions, for use in building MPC. We introduce additional constraints to their structure and weights to achieve a convex input-output relationship for multistep ahead predictions. We assess the consequences of the additional constraints for the model accuracy and test the models in a real-life MPC experiment in an apartment in Switzerland. In two five-day cooling experiments, MPC with Input Convex Neural Networks is able to keep room temperatures within comfort constraints while minimizing cooling energy consumption.

20.1ROMar 10, 2020

Optimization-Based Hierarchical Motion Planning for Autonomous Racing

José L. Vázquez, Marius Brühlmeier, Alexander Liniger et al.

In this paper we propose a hierarchical controller for autonomous racing where the same vehicle model is used in a two level optimization framework for motion planning. The high-level controller computes a trajectory that minimizes the lap time, and the low-level nonlinear model predictive path following controller tracks the computed trajectory online. Following a computed optimal trajectory avoids online planning and enables fast computational times. The efficiency is further enhanced by the coupling of the two levels through a terminal constraint, computed in the high-level controller. Including this constraint in the real-time optimization level ensures that the prediction horizon can be shortened, while safety is guaranteed. This proves crucial for the experimental validation of the approach on a full size driverless race car. The vehicle in question won two international student racing competitions using the proposed framework; moreover, our hierarchical controller achieved an improvement of 20% in the lap time compared to the state of the art result achieved using a very similar car and track.

24.6ROMay 13, 2019Code

AMZ Driverless: The Full Autonomous Racing System

Juraj Kabzan, Miguel de la Iglesia Valls, Victor Reijgwart et al.

This paper presents the algorithms and system architecture of an autonomous racecar. The introduced vehicle is powered by a software stack designed for robustness, reliability, and extensibility. In order to autonomously race around a previously unknown track, the proposed solution combines state of the art techniques from different fields of robotics. Specifically, perception, estimation, and control are incorporated into one high-performance autonomous racecar. This complex robotic system, developed by AMZ Driverless and ETH Zurich, finished 1st overall at each competition we attended: Formula Student Germany 2017, Formula Student Italy 2018 and Formula Student Germany 2018. We discuss the findings and learnings from these competitions and present an experimental evaluation of each module of our solution.

1.2GNNov 22, 2018

Inference of the three-dimensional chromatin structure and its temporal behavior

Bianca-Cristina Cristescu, Zalán Borsos, John Lygeros et al.

Understanding the three-dimensional (3D) structure of the genome is essential for elucidating vital biological processes and their links to human disease. To determine how the genome folds within the nucleus, chromosome conformation capture methods such as HiC have recently been employed. However, computational methods that exploit the resulting high-throughput, high-resolution data are still suffering from important limitations. In this work, we explore the idea of manifold learning for the 3D chromatin structure inference and present a novel method, REcurrent Autoencoders for CHromatin 3D structure prediction (REACH-3D). Our framework employs autoencoders with recurrent neural units to reconstruct the chromatin structure. In comparison to existing methods, REACH-3D makes no transfer function assumption and permits dynamic analysis. Evaluating REACH-3D on synthetic data indicated high agreement with the ground truth. When tested on real experimental HiC data, REACH-3D recovered most faithfully the expected biological properties and obtained the highest correlation coefficient with microscopy measurements. Last, REACH-3D was applied to dynamic HiC data, where it successfully modeled chromatin conformation during the cell cycle.