Ugo Rosolia

h-index18

25papers

1,675citations

Novelty47%

AI Score47

Ranked #31,257 of 194,257 authors (top 16%)#138 in SY (top 8%)

25 Papers

1.2SYDec 14, 2017

Learning Model Predictive Control for iterative tasks. A Data-Driven Control Framework

Ugo Rosolia, Francesco Borrelli

A Learning Model Predictive Controller (LMPC) for iterative tasks is presented. The controller is reference-free and is able to improve its performance by learning from previous iterations. A safe set and a terminal cost function are used in order to guarantee recursive feasibility and non-increasing performance at each iteration. The paper presents the control design approach, and shows how to recursively construct terminal set and terminal cost from state and input trajectories of previous iterations. Simulation results show the effectiveness of the proposed control logic.

1.2SYApr 25, 2018

Adaptive MPC for Iterative Tasks

Monimoy Bujarbaruah, Xiaojing Zhang, Ugo Rosolia et al.

This paper proposes an Adaptive Learning Model Predictive Control strategy for uncertain constrained linear systems performing iterative tasks. The additive uncertainty is modeled as the sum of a bounded process noise and an unknown constant offset. As new data becomes available, the proposed algorithm iteratively adapts the believed domain of the unknown offset after each iteration. An MPC strategy robust to all feasible offsets is employed in order to guarantee recursive feasibility. We show that the adaptation of the feasible offset domain reduces conservatism of the proposed strategy, compared to classical robust MPC strategies. As a result, the controller performance improves. Performance is measured in terms of following trajectories with lower associated costs at each iteration. Numerical simulations highlight the main advantages of the proposed approach.

1.2SYJan 21, 2021

Sample-Based Learning Model Predictive Control for Linear Uncertain Systems

Ugo Rosolia, Francesco Borrelli

We present a sample-based Learning Model Predictive Controller (LMPC) for constrained uncertain linear systems subject to bounded additive disturbances. The proposed controller builds on earlier work on LMPC for deterministic systems. First, we introduce the design of the safe set and value function used to guarantee safety and performance improvement. Afterwards, we show how these quantities can be approximated using noisy historical data. The effectiveness of the proposed approach is demonstrated on a numerical example. We show that the proposed LMPC is able to safely explore the state space and to iteratively improve the worst-case closed-loop performance, while robustly satisfying state and input constraints.

1.2SYMar 20, 2019

Simple Policy Evaluation for Data-Rich Iterative Tasks

Ugo Rosolia, Xiaojing Zhang, Francesco Borrelli

A data-based policy for iterative control task is presented. The proposed strategy is model-free and can be applied whenever safe input and state trajectories of a system performing an iterative task are available. These trajectories, together with a user-defined cost function, are exploited to construct a piecewise affine approximation to the value function. Approximated value functions are then used to evaluate the control policy by solving a linear program. We show that for linear system subject to convex cost and constraints, the proposed strategy guarantees closed-loop constraint satisfaction and performance bounds on the closed-loop trajectory. We evaluate the proposed strategy in simulations and experiments, the latter carried out on the Berkeley Autonomous Race Car (BARC) platform. We show that the proposed strategy is able to reduce the computation time by one order of magnitude while achieving the same performance as our model-based control algorithm.

6.9ROMar 9, 2022

MLNav: Learning to Safely Navigate on Martian Terrains

Shreyansh Daftry, Neil Abcouwer, Tyler Del Sesto et al.

We present MLNav, a learning-enhanced path planning framework for safety-critical and resource-limited systems operating in complex environments, such as rovers navigating on Mars. MLNav makes judicious use of machine learning to enhance the efficiency of path planning while fully respecting safety constraints. In particular, the dominant computational cost in such safety-critical settings is running a model-based safety checker on the proposed paths. Our learned search heuristic can simultaneously predict the feasibility for all path options in a single run, and the model-based safety checker is only invoked on the top-scoring paths. We validate in high-fidelity simulations using both real Martian terrain data collected by the Perseverance rover, as well as a suite of challenging synthetic terrains. Our experiments show that: (i) compared to the baseline ENav path planner on board the Perserverance rover, MLNav can provide a significant improvement in multiple key metrics, such as a 10x reduction in collision checks when navigating real Martian terrains, despite being trained with synthetic terrains; and (ii) MLNav can successfully navigate highly challenging terrains where the baseline ENav fails to find a feasible path before timing out.

1.2MAFeb 25, 2017

A decentralized algorithm for control of autonomous agents coupled by feasibility constraints

Ugo Rosolia, Francesco Braghin, Andrew G. Alleyne et al.

In this paper a decentralized control algorithm for systems composed of $N$ dynamically decoupled agents, coupled by feasibility constraints, is presented. The control problem is divided into $N$ optimal control sub-problems and a communication scheme is proposed to decouple computations. The derivative of the solution of each sub-problem is used to approximate the evolution of the system allowing the algorithm to decentralize and parallelize computations. The effectiveness of the proposed algorithm is shown through simulations in a cooperative driving scenario.

4.4OCFeb 20, 2023

Solving Recurrent MIPs with Semi-supervised Graph Neural Networks

Konstantinos Benidis, Ugo Rosolia, Syama Rangapuram et al.

We propose an ML-based model that automates and expedites the solution of MIPs by predicting the values of variables. Our approach is motivated by the observation that many problem instances share salient features and solution structures since they differ only in few (time-varying) parameters. Examples include transportation and routing problems where decisions need to be re-optimized whenever commodity volumes or link costs change. Our method is the first to exploit the sequential nature of the instances being solved periodically, and can be trained with ``unlabeled'' instances, when exact solutions are unavailable, in a semi-supervised setting. Also, we provide a principled way of transforming the probabilistic predictions into integral solutions. Using a battery of experiments with representative binary MIPs, we show the gains of our model over other ML-based optimization approaches.

7.8SYApr 8

Failure-Aware Iterative Learning of State-Control Invariant Sets

Ahmad Amine, Nick-Marios T. Kokolakis, Ugo Rosolia et al.

In this paper, we address the problem of computing maximal state-control invariant sets using failing trajectories. We introduce the concept of state-control invariance, which extends control invariance from the state space to the joint state-control space. The maximal state-control invariant (MSCI) set simultaneously encodes the maximal control invariant set (MCI) and, for each state in the MCI, the set of control inputs that preserve invariance. We prove that the state projection of the MSCI is the MCI and the state-dependent sections of the MSCI are the admissible invariance-preserving inputs. Building on this framework, we develop a Failure-Aware Iterative Learning (FAIL) algorithm for deterministic linear time invariant systems with polytopic constraints. The algorithm iteratively updates a constraint set in the state-control space by learning predecessor halfspaces from one-step failing state-input pairs, without knowing the dynamics. For each failure, FAIL learns the violated halfspaces of the predecessor of the constraint set by a regression on failing trajectories. We prove that the learned constraint set converges monotonically to the MSCI. Numerical experiments on a double integrator system validate the proposed approach.

5.6ROFeb 18

SIT-LMPC: Safe Information-Theoretic Learning Model Predictive Control for Iterative Tasks

Zirui Zang, Ahmad Amine, Nick-Marios T. Kokolakis et al.

Robots executing iterative tasks in complex, uncertain environments require control strategies that balance robustness, safety, and high performance. This paper introduces a safe information-theoretic learning model predictive control (SIT-LMPC) algorithm for iterative tasks. Specifically, we design an iterative control framework based on an information-theoretic model predictive control algorithm to address a constrained infinite-horizon optimal control problem for discrete-time nonlinear stochastic systems. An adaptive penalty method is developed to ensure safety while balancing optimality. Trajectories from previous iterations are utilized to learn a value function using normalizing flows, which enables richer uncertainty modeling compared to Gaussian priors. SIT-LMPC is designed for highly parallel execution on graphics processing units, allowing efficient real-time optimization. Benchmark simulations and hardware experiments demonstrate that SIT-LMPC iteratively improves system performance while robustly satisfying system constraints.

7.5LGDec 14, 2021Code

CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning

Kevin Huang, Sahin Lale, Ugo Rosolia et al.

Current state-of-the-art model-based reinforcement learning algorithms use trajectory sampling methods, such as the Cross-Entropy Method (CEM), for planning in continuous control settings. These zeroth-order optimizers require sampling a large number of trajectory rollouts to select an optimal action, which scales poorly for large prediction horizons or high dimensional action spaces. First-order methods that use the gradients of the rewards with respect to the actions as an update can mitigate this issue, but suffer from local optima due to the non-convex optimization landscape. To overcome these issues and achieve the best of both worlds, we propose a novel planner, Cross-Entropy Method with Gradient Descent (CEM-GD), that combines first-order methods with CEM. At the beginning of execution, CEM-GD uses CEM to sample a significant amount of trajectory rollouts to explore the optimization landscape and avoid poor local minima. It then uses the top trajectories as initialization for gradient descent and applies gradient updates to each of these trajectories to find the optimal action sequence. At each subsequent time step, however, CEM-GD samples much fewer trajectories from CEM before applying gradient updates. We show that as the dimensionality of the planning problem increases, CEM-GD maintains desirable performance with a constant small number of samples by using the gradient information, while avoiding local optima using initially well-sampled trajectories. Furthermore, CEM-GD achieves better performance than CEM on a variety of continuous control benchmarks in MuJoCo with 100x fewer samples per time step, resulting in around 25% less computation time and 10% less memory usage. The implementation of CEM-GD is available at $\href{https://github.com/KevinHuang8/CEM-GD}{\text{https://github.com/KevinHuang8/CEM-GD}}$.

8.9ROMar 8, 2021Code

Learning to Control an Unstable System with One Minute of Data: Leveraging Gaussian Process Differentiation in Predictive Control

Ivan D. Jimenez Rodriguez, Ugo Rosolia, Aaron D. Ames et al.

We present a straightforward and efficient way to control unstable robotic systems using an estimated dynamics model. Specifically, we show how to exploit the differentiability of Gaussian Processes to create a state-dependent linearized approximation of the true continuous dynamics that can be integrated with model predictive control. Our approach is compatible with most Gaussian process approaches for system identification, and can learn an accurate model using modest amounts of training data. We validate our approach by learning the dynamics of an unstable system such as a segway with a 7-D state space and 2-D input space (using only one minute of data), and we show that the resulting controller is robust to unmodelled dynamics and disturbances, while state-of-the-art control methods based on nominal models can fail under small perturbations. Code is open sourced at https://github.com/learning-and-control/core .

26.6ROFeb 14, 2022Code

Autonomous Vehicles on the Edge: A Survey on Autonomous Vehicle Racing

Johannes Betz, Hongrui Zheng, Alexander Liniger et al.

The rising popularity of self-driving cars has led to the emergence of a new research field in the recent years: Autonomous racing. Researchers are developing software and hardware for high performance race vehicles which aim to operate autonomously on the edge of the vehicles limits: High speeds, high accelerations, low reaction times, highly uncertain, dynamic and adversarial environments. This paper represents the first holistic survey that covers the research in the field of autonomous racing. We focus on the field of autonomous racecars only and display the algorithms, methods and approaches that are used in the fields of perception, planning and control as well as end-to-end learning. Further, with an increasing number of autonomous racing competitions, researchers now have access to a range of high performance platforms to test and evaluate their autonomy algorithms. This survey presents a comprehensive overview of the current autonomous racing platforms emphasizing both the software-hardware co-evolution to the current stage. Finally, based on additional discussion with leading researchers in the field we conclude with a summary of open research challenges that will guide future researchers in this field.

5.3ROOct 3, 2021

Mixed Observable RRT: Multi-Agent Mission-Planning in Partially Observable Environments

Kasper Johansson, Ugo Rosolia, Wyatt Ubellacker et al.

This paper considers centralized mission-planning for a heterogeneous multi-agent system with the aim of locating a hidden target. We propose a mixed observable setting, consisting of a fully observable state-space and a partially observable environment, using a hidden Markov model. First, we construct rapidly exploring random trees (RRTs) to introduce the mixed observable RRT for finding plausible mission plans giving way-points for each agent. Leveraging this construction, we present a path-selection strategy based on a dynamic programming approach, which accounts for the uncertainty from partial observations and minimizes the expected cost. Finally, we combine the high-level plan with model predictive control algorithms to evaluate the approach on an experimental setup consisting of a quadruped robot and a drone. It is shown that agents are able to make intelligent decisions to explore the area efficiently and to locate the target through collaborative actions.

15.2SYSep 10, 2021Code

Interactive multi-modal motion planning with Branch Model Predictive Control

Yuxiao Chen, Ugo Rosolia, Wyatt Ubellacker et al.

Motion planning for autonomous robots and vehicles in presence of uncontrolled agents remains a challenging problem as the reactive behaviors of the uncontrolled agents must be considered. Since the uncontrolled agents usually demonstrate multimodal reactive behavior, the motion planner needs to solve a continuous motion planning problem under these behaviors, which contains a discrete element. We propose a branch Model Predictive Control (MPC) framework that plans over feedback policies to leverage the reactive behavior of the uncontrolled agent. In particular, a scenario tree is constructed from a finite set of policies of the uncontrolled agent, and the branch MPC solves for a feedback policy in the form of a trajectory tree, which shares the same topology as the scenario tree. Moreover, coherent risk measures such as the Conditional Value at Risk (CVaR) are used as a tuning knob to adjust the tradeoff between performance and robustness. The proposed branch MPC framework is tested on an overtake and lane change task and a merging task for autonomous vehicles in simulation, and on the motion planning of an autonomous quadruped robot alongside an uncontrolled quadruped in experiments. The result demonstrates interesting human-like behaviors, achieving a balance between safety and performance.

8.9AISep 9, 2021

Risk-Averse Decision Making Under Uncertainty

Mohamadreza Ahmadi, Ugo Rosolia, Michel D. Ingham et al.

A large class of decision making under uncertainty problems can be described via Markov decision processes (MDPs) or partially observable MDPs (POMDPs), with application to artificial intelligence and operations research, among others. Traditionally, policy synthesis techniques are proposed such that a total expected cost or reward is minimized or maximized. However, optimality in the total expected cost sense is only reasonable if system behavior in the large number of runs is of interest, which has limited the use of such policies in practical mission-critical scenarios, wherein large deviations from the expected behavior may lead to mission failure. In this paper, we consider the problem of designing policies for MDPs and POMDPs with objectives and constraints in terms of dynamic coherent risk measures, which we refer to as the constrained risk-averse problem. For MDPs, we reformulate the problem into a infsup problem via the Lagrangian framework and propose an optimization-based method to synthesize Markovian policies. For MDPs, we demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. For POMDPs, we show that, if the coherent risk measures can be defined as a Markov risk transition mapping, an infinite-dimensional optimization can be used to design Markovian belief-based policies. For stochastic finite-state controllers (FSCs), we show that the latter optimization simplifies to a (finite-dimensional) DCP and can be solved by the DCCP framework. We incorporate these DCPs in a policy iteration algorithm to design risk-averse FSCs for POMDPs.

15.2AIDec 4, 2020

Constrained Risk-Averse Markov Decision Processes

Mohamadreza Ahmadi, Ugo Rosolia, Michel D. Ingham et al.

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk objectives and constraints can be represented by a Markov risk transition mapping, we propose an optimization-based method to synthesize Markovian policies that lower-bound the constrained risk-averse problem. We demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. Finally, we illustrate the effectiveness of the proposed method with numerical experiments on a rover navigation problem involving conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.

13.0RONov 19, 2020

Decentralized Task and Path Planning for Multi-Robot Systems

Yuxiao Chen, Ugo Rosolia, Aaron D. Ames

We consider a multi-robot system with a team of collaborative robots and multiple tasks that emerges over time. We propose a fully decentralized task and path planning (DTPP) framework consisting of a task allocation module and a localized path planning module. Each task is modeled as a Markov Decision Process (MDP) or a Mixed Observed Markov Decision Process (MOMDP) depending on whether full states or partial states are observable. The task allocation module then aims at maximizing the expected pure reward (reward minus cost) of the robotic team. We fuse the Markov model into a factor graph formulation so that the task allocation can be decentrally solved using the max-sum algorithm. Each robot agent follows the optimal policy synthesized for the Markov model and we propose a localized forward dynamic programming scheme that resolves conflicts between agents and avoids collisions. The proposed framework is demonstrated with high fidelity ROS simulations and experiments with multiple ground robots.

11.3RONov 6, 2020

Reactive motion planning with probabilistic safety guarantees

Yuxiao Chen, Ugo Rosolia, Chuchu Fan et al.

Motion planning in environments with multiple agents is critical to many important autonomous applications such as autonomous vehicles and assistive robots. This paper considers the problem of motion planning, where the controlled agent shares the environment with multiple uncontrolled agents. First, a predictive model of the uncontrolled agents is trained to predict all possible trajectories within a short horizon based on the scenario. The prediction is then fed to a motion planning module based on model predictive control. We proved generalization bound for the predictive model using three different methods, post-bloating, support vector machine (SVM), and conformal analysis, all capable of generating stochastic guarantees of the correctness of the predictor. The proposed approach is demonstrated in simulation in a scenario emulating autonomous highway driving.

4.3SYApr 6, 2020

Control of Unknown Nonlinear Systems with Linear Time-Varying MPC

Dimitris Papadimitriou, Ugo Rosolia, Francesco Borrelli

We present a Model Predictive Control (MPC) strategy for unknown input-affine nonlinear dynamical systems. A non-parametric method is used to estimate the nonlinear dynamics from observed data. The estimated nonlinear dynamics are then linearized over time varying regions of the state space to construct an Affine Time Varying (ATV) model. Error bounds arising from the estimation and linearization procedure are computed by using sampling techniques. The ATV model and the uncertainty sets are used to design a robust Model Predictive Control (MPC) problem which guarantees safety for the unknown system with high probability. A simple nonlinear example demonstrates the effectiveness of the approach where commonly used linearization methods fail.

6.6SYApr 2, 2020Code

Trajectory Optimization for Nonlinear Multi-Agent Systems using Decentralized Learning Model Predictive Control

Edward L. Zhu, Yvonne R. Stürz, Ugo Rosolia et al.

We present a decentralized minimum-time trajectory optimization scheme based on learning model predictive control for multi-agent systems with nonlinear decoupled dynamics and coupled state constraints. By performing the same task iteratively, data from previous task executions is used to construct and improve local time-varying safe sets and an approximate value function. These are used in a decoupled MPC problem as terminal sets and terminal cost functions. Our framework results in a decentralized controller, which requires no communication between agents over each iteration of task execution, and guarantees persistent feasibility, finite-time closed-loop convergence, and non-decreasing performance of the global system over task iterations. Numerical experiments of a multi-vehicle collision avoidance scenario demonstrate the effectiveness of the proposed scheme.

7.3SYMar 3, 2020

ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions

Brijen Thananjeyan, Ashwin Balakrishna, Ugo Rosolia et al.

Sample-based learning model predictive control (LMPC) strategies have recently attracted attention due to their desirable theoretical properties and their good empirical performance on robotic tasks. However, prior analysis of LMPC controllers for stochastic systems has mainly focused on linear systems in the iterative learning control setting. We present a novel LMPC algorithm, Adjustable Boundary Condition LMPC (ABC-LMPC), which enables rapid adaptation to novel start and goal configurations and theoretically show that the resulting controller guarantees iterative improvement in expectation for stochastic nonlinear systems. We present results with a practical instantiation of this algorithm and experimentally demonstrate that the resulting controller adapts to a variety of initial and terminal conditions on 3 stochastic continuous control tasks.

20.5LGMay 31, 2019

Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks

Brijen Thananjeyan, Ashwin Balakrishna, Ugo Rosolia et al.

Reinforcement learning (RL) for robotics is challenging due to the difficulty in hand-engineering a dense cost function, which can lead to unintended behavior, and dynamical uncertainty, which makes exploration and constraint satisfaction challenging. We address these issues with a new model-based reinforcement learning algorithm, Safety Augmented Value Estimation from Demonstrations (SAVED), which uses supervision that only identifies task completion and a modest set of suboptimal demonstrations to constrain exploration and learn efficiently while handling complex constraints. We then compare SAVED with 3 state-of-the-art model-based and model-free RL algorithms on 6 standard simulation benchmarks involving navigation and manipulation and a physical knot-tying task on the da Vinci surgical robot. Results suggest that SAVED outperforms prior methods in terms of success rate, constraint satisfaction, and sample efficiency, making it feasible to safely learn a control policy directly on a real robot in less than an hour. For tasks on the robot, baselines succeed less than 5% of the time while SAVED has a success rate of over 75% in the first 50 training iterations. Code and supplementary material is available at https://tinyurl.com/saved-rl.

17.2OCFeb 23, 2017

Learning Model Predictive Control for Iterative Tasks: A Computationally Efficient Approach for Linear System

Ugo Rosolia, Francesco Borrelli

A Learning Model Predictive Controller (LMPC) for linear system in presented. The proposed controller is an extension of the LMPC [1] and it aims to decrease the computational burden. The control scheme is reference-free and is able to improve its performance by learning from previous iterations. A convex safe set and a terminal cost function are used in order to guarantee recursive feasibility and non-increasing performance at each iteration. The paper presents the control design approach, and shows how to recursively construct the convex terminal set and the terminal cost from state and input trajectories of previous iterations. Simulation results show the effectiveness of the proposed control logic.

13.4LGOct 20, 2016

Autonomous Racing using Learning Model Predictive Control

Ugo Rosolia, Ashwin Carvalho, Francesco Borrelli

A novel learning Model Predictive Control technique is applied to the autonomous racing problem. The goal of the controller is to minimize the time to complete a lap. The proposed control strategy uses the data from previous laps to improve its performance while satisfying safety requirements. Moreover, a system identification technique is proposed to estimate the vehicle dynamics. Simulation results with the high fidelity simulator software CarSim show the effectiveness of the proposed control scheme.

24.1SYSep 6, 2016

Learning Model Predictive Control for iterative tasks. A Data-Driven Control Framework

Ugo Rosolia, Francesco Borrelli