SY · Apr 25, 2018
Adaptive MPC for Iterative Tasks
Monimoy Bujarbaruah, Xiaojing Zhang, Ugo Rosolia et al.
This paper proposes an Adaptive Learning Model Predictive Control strategy for uncertain constrained linear systems performing iterative tasks. The additive uncertainty is modeled as the sum of a bounded process noise and an unknown constant offset. As new data becomes available, the proposed algorithm iteratively adapts the believed domain of the unknown offset after each iteration. An MPC strategy robust to all feasible offsets is employed in order to guarantee recursive feasibility. We show that the adaptation of the feasible offset domain reduces conservatism of the proposed strategy, compared to classical robust MPC strategies. As a result, the controller performance improves. Performance is measured in terms of following trajectories with lower associated costs at each iteration. Numerical simulations highlight the main advantages of the proposed approach.
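The offset-domain adaptation described in this abstract can be illustrated with a minimal set-membership sketch. It is not the authors' algorithm: it assumes a hypothetical model x_next = A x + B u + theta + w with a componentwise noise bound |w| <= w_max, and it represents the believed offset domain as an axis-aligned box. All names (`update_offset_bounds`, `theta_true`, `w_max`) and numerical values are illustrative assumptions.

```python
import numpy as np

def update_offset_bounds(lo, hi, A, B, x, u, x_next, w_max):
    """Intersect the current offset box [lo, hi] with the box implied by one transition."""
    # Under the assumed model, the residual equals theta + w with |w| <= w_max.
    residual = x_next - A @ x - B @ u
    new_lo = np.maximum(lo, residual - w_max)
    new_hi = np.minimum(hi, residual + w_max)
    return new_lo, new_hi

# Hypothetical system and true offset, used only to generate data.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
theta_true = np.array([0.3, -0.2])
w_max = 0.05

# Start from a conservative prior box and shrink it as data arrives.
lo, hi = np.full(2, -1.0), np.full(2, 1.0)
x = np.zeros(2)
for _ in range(50):
    u = rng.uniform(-1.0, 1.0, size=1)
    w = rng.uniform(-w_max, w_max, size=2)
    x_next = A @ x + B @ u + theta_true + w
    lo, hi = update_offset_bounds(lo, hi, A, B, x, u, x_next, w_max)
    x = x_next

print("offset lower bound:", lo)
print("offset upper bound:", hi)
```

Because each update only intersects boxes, the believed domain never grows and always contains the true offset under the stated noise bound; a robust controller designed against the smaller box is correspondingly less conservative.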
SY · Nov 30, 2018
Adaptive MPC for Autonomous Lane Keeping
Monimoy Bujarbaruah, Xiaojing Zhang, H. Eric Tseng et al.
This paper proposes an Adaptive Robust Model Predictive Control strategy for lateral control in lane keeping problems, where we continuously learn an unknown, but constant steering angle offset present in the steering system. Longitudinal velocity is assumed constant. The goal is to minimize the outputs, which are distance from lane center line and the steady state heading angle error, while satisfying respective safety constraints. We do not assume perfect knowledge of the vehicle lateral dynamics model, and we estimate and adapt in real time the maximum possible bound of the steering angle offset from data using a robust Set Membership Method based approach. Our approach is well suited even to scenarios with sharp curvatures at high speed, where obtaining a precise model bias for constrained control is difficult, but learning from data can be helpful. We ensure persistent feasibility using a switching strategy during change of lane curvature. The proposed methodology is general and can be applied to more complex vehicle dynamics problems.
SY · Apr 25, 2018
Adaptive MPC with Chance Constraints for FIR Systems
Monimoy Bujarbaruah, Xiaojing Zhang, Francesco Borrelli
This paper proposes an adaptive stochastic Model Predictive Control (MPC) strategy for stable linear time invariant systems in the presence of bounded disturbances. We consider multi-input multi-output systems that can be expressed by a finite impulse response model, whose parameters we estimate using a linear Recursive Least Squares algorithm. Building on the work of [1],[2], our approach is able to handle hard input constraints and probabilistic output constraints. By using tools from distributionally robust optimization, we formulate our MPC design task as a convex optimization problem that can be solved using existing tools. Furthermore, we show that our adaptive stochastic MPC algorithm is persistently feasible. The efficacy of the developed algorithm is demonstrated in a numerical example and the results are compared with the adaptive robust MPC algorithm of [2].
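As a rough illustration of the parameter estimation step mentioned in this abstract, the sketch below runs a textbook Recursive Least Squares update on a single-input single-output FIR model. The paper treats the multi-input multi-output case; the impulse response `h_true`, the noise level, and the initial covariance are assumptions made only for this example.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least squares step for the model y = phi @ theta + noise."""
    Pphi = P @ phi
    gain = Pphi / (lam + phi @ Pphi)           # Kalman-like gain vector
    theta = theta + gain * (y - phi @ theta)   # correct with the prediction error
    P = (P - np.outer(gain, Pphi)) / lam       # covariance update (lam = forgetting factor)
    return theta, P

rng = np.random.default_rng(1)
h_true = np.array([0.5, 0.3, -0.2, 0.1])       # hypothetical FIR coefficients
n = len(h_true)
u = rng.standard_normal(200)                   # exciting input sequence

theta, P = np.zeros(n), 1e3 * np.eye(n)
for t in range(n - 1, len(u)):
    phi = u[t - n + 1 : t + 1][::-1]           # regressor [u_t, u_{t-1}, ..., u_{t-n+1}]
    y = phi @ h_true + rng.uniform(-0.01, 0.01)  # bounded measurement disturbance
    theta, P = rls_update(theta, P, phi, y)

print("estimated FIR coefficients:", theta)
```

With a forgetting factor of 1 this reduces to ordinary least squares computed recursively, which suits the time-invariant setting the abstract describes.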
RO · Mar 7, 2021 · Code
Learning Environment Constraints in Collaborative Robotics: A Decentralized Leader-Follower Approach
Monimoy Bujarbaruah, Yvonne R. Stürz, Conrad Holda et al.
In this paper, we propose a leader-follower hierarchical strategy for two robots collaboratively transporting an object in a partially known environment with obstacles. Both robots sense the local surrounding environment and react to obstacles in their proximity. We consider no explicit communication, so the local environment information and the control actions are not shared between the robots. At any given time step, the leader solves a model predictive control (MPC) problem with its known set of obstacles and plans a feasible trajectory to complete the task. The follower estimates the inputs of the leader and uses a policy to assist the leader while reacting to obstacles in its proximity. The leader infers obstacles in the follower's vicinity by using the difference between the predicted and the real-time estimated follower control action. A method to switch the leader-follower roles is used to improve the control performance in tight environments. The efficacy of our approach is demonstrated with detailed comparisons to two alternative strategies, where it achieves the highest success rate, while completing the task fastest. See the link www.dropbox.com/s/hexadigqkvspaeh/IROS_Video.mp4?dl=0 for a descriptive video of the algorithm.
SY · Nov 20, 2020
Learning How to Solve Bubble Ball
Hotae Lee, Monimoy Bujarbaruah, Francesco Borrelli
"Bubble Ball" is a game built on a 2D physics engine, where a finite set of objects can modify the motion of a bubble-like ball. The objective is to choose the set and the initial configuration of the objects, in order to get the ball to reach a target flag. The presence of obstacles, friction, contact forces and combinatorial object choices makes the game hard to solve. In this paper, we propose a hierarchical predictive framework which solves Bubble Ball. Geometric, kinematic and dynamic models are used at different levels of the hierarchy. At each level of the game, data collected during failed iterations are used to update the models at all hierarchical levels and converge to a feasible solution to the game. The proposed approach successfully solves a large set of Bubble Ball levels within a reasonable number of trials. The proposed framework can also be used to solve other physics-based games, especially with limited training data from human demonstrations.
RO · Sep 9, 2020
Traction Adaptive Motion Planning at the Limits of Handling
Lars Svensson, Monimoy Bujarbaruah, Arpit Karsolia et al.
In this paper, we address the problem of motion planning and control at the limits of handling, under locally varying traction conditions. We propose a novel solution method where traction variations over the prediction horizon are represented by time-varying tire force constraints, derived from a predictive friction estimate. A constrained finite time optimal control problem is solved in a receding horizon fashion, imposing these time-varying constraints. Furthermore, our method features an integrated sampling augmentation procedure that addresses the problems of infeasibility and sensitivity to local minima that arise at abrupt constraint alterations, e.g., due to sudden friction changes. We validate the proposed algorithm on a Volvo FH16 heavy-duty vehicle, in a range of critical scenarios. Experimental results indicate that traction adaptive motion planning and control improves the vehicle's capacity to avoid accidents, both when adapting to low local traction, by ensuring dynamic feasibility of the planned motion, and when adapting to high local traction, by realizing high traction utilization.
RO · Jul 19, 2020 · Code
Learning to Play Cup-and-Ball with Noisy Camera Observations
Monimoy Bujarbaruah, Tony Zheng, Akhil Shetty et al.
Playing the cup-and-ball game is an intriguing task for robotics research since it abstracts important problem characteristics including system nonlinearity, contact forces and precise positioning as a terminal goal. In this paper, we present a learning model based control strategy for the cup-and-ball game, where a Universal Robots UR5e manipulator arm learns to catch a ball in one of the cups on a Kendama. Our control problem is divided into two sub-tasks, namely (i) swinging the ball up in a constrained motion, and (ii) catching the free-falling ball. The swing-up trajectory is computed offline, and applied in open-loop to the arm. Subsequently, a convex optimization problem is solved online during the ball's free-fall to control the manipulator and catch the ball. The controller utilizes noisy position feedback of the ball from an Intel RealSense D435 depth camera. We propose a novel iterative framework, where data is used to learn the support of the camera noise distribution iteratively in order to update the control policy. The probability of a catch with a fixed policy is computed empirically with a user-specified number of roll-outs. Our design guarantees that the probability of a catch increases in the limit, as the learned support nears the true support of the camera noise distribution. High-fidelity MuJoCo simulations and preliminary experimental results support our theoretical analysis.
SY · Jun 9, 2020 · Code
Learning to Satisfy Unknown Constraints in Iterative MPC
Monimoy Bujarbaruah, Charlott Vallon, Francesco Borrelli
We propose a control design method for linear time-invariant systems that iteratively learns to satisfy unknown polyhedral state constraints. At each iteration of a repetitive task, the method constructs an estimate of the unknown environment constraints using collected closed-loop trajectory data. This estimated constraint set is improved iteratively upon collection of additional data. An MPC controller is then designed to robustly satisfy the estimated constraint set. This paper presents the details of the proposed approach, and provides robust and probabilistic guarantees of constraint satisfaction as a function of the number of executed task iterations. We demonstrate the safety of the proposed framework and explore the safety vs. performance trade-off in a detailed numerical example.
SY · Dec 9, 2019
Exploiting Model Sparsity in Adaptive MPC: A Compressed Sensing Viewpoint
Monimoy Bujarbaruah, Charlott Vallon
This paper proposes an Adaptive Stochastic Model Predictive Control (MPC) strategy for stable linear time-invariant systems in the presence of bounded disturbances. We consider multi-input, multi-output systems that can be expressed by a Finite Impulse Response (FIR) model. The parameters of the FIR model corresponding to each output are unknown but assumed sparse. We estimate these parameters using the Recursive Least Squares algorithm. The estimates are then improved using set-based bounds obtained by solving the Basis Pursuit Denoising [1] problem. Our approach is able to handle hard input constraints and probabilistic output constraints. Using tools from distributionally robust optimization, we reformulate the probabilistic output constraints as tractable convex second-order cone constraints, which enables us to pose our MPC design task as a convex optimization problem. The efficacy of the developed algorithm is highlighted with a thorough numerical example, where we demonstrate performance gain over the counterpart algorithm of [2], which does not utilize the sparsity information of the system impulse response parameters during control design.
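The step from probabilistic constraints to second-order cone constraints can be illustrated with a standard result from the distributionally robust optimization literature. The symbols below are assumptions for illustration and are not taken from the paper: $\xi$ is a random vector known only through its mean $\mu$ and covariance $\Sigma$, $x$ is the decision variable, $b$ is a scalar bound, and $\epsilon \in (0,1)$ is the allowed violation probability. The paper's exact formulation may differ.

```latex
% Ambiguity set: all distributions with mean \mu and covariance \Sigma.
\inf_{\mathbb{P} \in \mathcal{P}(\mu, \Sigma)}
  \mathbb{P}\left( \xi^\top x \le b \right) \ge 1 - \epsilon
\quad \Longleftrightarrow \quad
\mu^\top x + \sqrt{\frac{1-\epsilon}{\epsilon}}\,
  \left\| \Sigma^{1/2} x \right\|_2 \le b
```

The right-hand side is a second-order cone constraint in $x$, so a problem with a convex cost and constraints of this form remains a convex program that off-the-shelf conic solvers can handle.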
SY · Sep 11, 2019
Relaxed Actor-Critic with Convergence Guarantees for Continuous-Time Optimal Control of Nonlinear Systems
Jingliang Duan, Jie Li, Qiang Ge et al.
This paper presents the Relaxed Continuous-Time Actor-critic (RCTAC) algorithm, a method for finding the nearly optimal policy for nonlinear continuous-time (CT) systems with known dynamics and infinite horizon, such as the path-tracking control of vehicles. RCTAC has several advantages over existing adaptive dynamic programming algorithms for CT systems. It does not require the "admissibility" of the initialized policy or the input-affine nature of controlled systems for convergence. Instead, given any initial policy, RCTAC can converge to an admissible, and subsequently nearly optimal policy for a general nonlinear system with a saturated controller. RCTAC consists of two phases: a warm-up phase and a generalized policy iteration phase. The warm-up phase minimizes the square of the Hamiltonian to achieve admissibility, while the generalized policy iteration phase relaxes the update termination conditions for faster convergence. The convergence and optimality of the algorithm are proven through Lyapunov analysis, and its effectiveness is demonstrated through simulations and real-world path-tracking tasks.
LG · Jun 19, 2019
Safe and Near-Optimal Policy Learning for Model Predictive Control using Primal-Dual Neural Networks
Xiaojing Zhang, Monimoy Bujarbaruah, Francesco Borrelli
In this paper, we propose a novel framework for approximating the explicit MPC law for linear parameter-varying systems using supervised learning. In contrast to most existing approaches, we learn not only the control policy but also a "certificate policy" that allows us to estimate the sub-optimality of the learned control policy online, during execution. We learn both these policies from data using supervised learning techniques, and also provide a randomized method that allows us to guarantee the quality of each learned policy, measured in terms of feasibility and optimality. This in turn allows us to bound the probability of the learned control policy being infeasible or suboptimal, where the check is performed by the certificate policy. Since our algorithm does not require the solution of an optimization problem during run-time, it can be deployed even on resource-constrained systems. We illustrate the efficacy of the proposed framework on a vehicle dynamics control problem where we demonstrate a speedup of up to two orders of magnitude compared to online optimization with minimal performance degradation.
RO · Mar 11, 2019
Adaptive Trajectory Planning and Optimization at Limits of Handling
Lars Svensson, Monimoy Bujarbaruah, Nitin Kapania et al.
In this paper, we tackle the problem of trajectory planning and control of a vehicle under locally varying traction limitations, in the presence of suddenly appearing obstacles. We employ concepts from adaptive model predictive control for run-time adaptation of tire force constraints that are imposed by local traction conditions. To solve the resulting optimization problem for real-time control synthesis with such time varying constraints, we propose a novel numerical scheme based on Real Time Iteration Sequential Quadratic Programming (RTI-SQP), which we call Sampling Augmented Adaptive RTI (SAA-RTI). Sampling augmentation of conventional RTI-SQP provides additional feasible candidate trajectories for warmstarting the optimization procedure. Thus, the proposed SAA-RTI algorithm enables real time constraint adaptation and reduces sensitivity to local minima. Through extensive numerical simulations we demonstrate that our method increases the vehicle's capacity to avoid accidents in scenarios with unanticipated obstacles and locally varying traction, compared to equivalent non-adaptive control schemes and traditional planning and tracking approaches.