Moritz Diehl

RO
h-index29
24papers
408citations
Novelty53%
AI Score50

24 Papers

RODec 6, 2022Code
Safe Imitation Learning of Nonlinear Model Predictive Control for Flexible Robots

Shamil Mamedov, Rudolf Reiter, Seyed Mahdi Basiri Azad et al.

Flexible robots may overcome some of the industry's major challenges, such as enabling intrinsically safe human-robot collaboration and achieving a higher payload-to-mass ratio. However, controlling flexible robots is complicated due to their complex dynamics, which include oscillatory behavior and a high-dimensional state space. Nonlinear model predictive control (NMPC) offers an effective means to control such robots, but its significant computational demand often limits its application in real-time scenarios. To enable fast control of flexible robots, we propose a framework for a safe approximation of NMPC using imitation learning and a predictive safety filter. Our framework significantly reduces computation time while incurring a slight loss in performance. Compared to NMPC, our framework shows more than an eightfold improvement in computation time when controlling a three-dimensional flexible robot arm in simulation, all while guaranteeing safety constraints. Notably, our approach outperforms state-of-the-art reinforcement learning methods. The development of fast and safe approximate NMPC holds the potential to accelerate the adoption of flexible robots in industry. The project code is available at: tinyurl.com/anmpc4fr

OCSep 15, 2011
Combining Convex-Concave Decompositions and Linearization Approaches for solving BMIs, with application to Static Output Feedback

Quoc Tran Dinh, Suat Gumussoy, Wim Michiels et al.

A novel optimization method is proposed to minimize a convex function subject to bilinear matrix inequality (BMI) constraints. The key idea is to decompose the bilinear mapping as a difference between two positive semidefinite convex mappings. At each iteration of the algorithm the concave part is linearized, leading to a convex subproblem.Applications to various output feedback controller synthesis problems are presented. In these applications the subproblem in each iteration step can be turned into a convex optimization problem with linear matrix inequality (LMI) constraints. The performance of the algorithm has been benchmarked on the data from COMPleib library.

OCJul 28, 2011
Sequential Convex Programming Methods for Solving Nonlinear Optimization Problems with DC constraints

Tran Dinh Quoc, Moritz Diehl

This paper investigates the relation between sequential convex programming (SCP) as, e.g., defined in [24] and DC (difference of two convex functions) programming. We first present an SCP algorithm for solving nonlinear optimization problems with DC constraints and prove its convergence. Then we combine the proposed algorithm with a relaxation technique to handle inconsistent linearizations. Numerical tests are performed to investigate the behaviour of the class of algorithms.

ROOct 23, 2022
Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control

Alessandro Saviolo, Jonathan Frey, Abhishek Rathod et al.

Model-based control requires an accurate model of the system dynamics for precisely and safely controlling the robot in complex and dynamic environments. Moreover, in the presence of variations in the operating conditions, the model should be continuously refined to compensate for dynamics changes. In this paper, we present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems. We combine offline learning from past experience and online learning from current robot interaction with the unknown environment. These two ingredients enable a highly sample-efficient and adaptive learning process, capable of accurately inferring model dynamics in real-time even in operating regimes that greatly differ from the training distribution. Moreover, we design an uncertainty-aware model predictive controller that is heuristically conditioned to the aleatoric (data) uncertainty of the learned dynamics. This controller actively chooses the optimal control actions that (i) optimize the control performance and (ii) improve the efficiency of online learning sample collection. We demonstrate the effectiveness of our method through a series of challenging real-world experiments using a quadrotor system. Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions, while it significantly outperforms classical and adaptive control baselines.

OCNov 30, 2011
Combining Lagrangian Decomposition and Excessive Gap Smoothing Technique for Solving Large-Scale Separable Convex Optimization Problems

Tran Dinh Quoc, Carlo Savorgnan, Moritz Diehl

A new algorithm for solving large-scale convex optimization problems with a separable objective function is proposed. The basic idea is to combine three techniques: Lagrangian dual decomposition, excessive gap and smoothing. The main advantage of this algorithm is that it dynamically updates the smoothness parameters which leads to numerically robust performance. The convergence of the algorithm is proved under weak conditions imposed on the original problem. The rate of convergence is $O(\frac{1}{k})$, where $k$ is the iteration counter. In the second part of the paper, the algorithm is coupled with a dual scheme to construct a switching variant of the dual decomposition. We discuss implementation issues and make a theoretical comparison. Numerical examples confirm the theoretical results.

OCMay 17, 2011
Real-Time Sequential Convex Programming for Optimal Control Applications

Tran Dinh Quoc, Carlo Savorgnan, Moritz Diehl

This paper proposes real-time sequential convex programming (RTSCP), a method for solving a sequence of nonlinear optimization problems depending on an online parameter. We provide a contraction estimate for the proposed method and, as a byproduct, a new proof of the local convergence of sequential convex programming. The approach is illustrated by an example where RTSCP is applied to nonlinear model predictive control.

LGApr 3, 2023
Imitation Learning from Nonlinear MPC via the Exact Q-Loss and its Gauss-Newton Approximation

Andrea Ghezzi, Jasper Hoffman, Jonathan Frey et al.

This work presents a novel loss function for learning nonlinear Model Predictive Control policies via Imitation Learning. Standard approaches to Imitation Learning neglect information about the expert and generally adopt a loss function based on the distance between expert and learned controls. In this work, we present a loss based on the Q-function directly embedding the performance objectives and constraint satisfaction of the associated Optimal Control Problem (OCP). However, training a Neural Network with the Q-loss requires solving the associated OCP for each new sample. To alleviate the computational burden, we derive a second Q-loss based on the Gauss-Newton approximation of the OCP resulting in a faster training time. We validate our losses against Behavioral Cloning, the standard approach to Imitation Learning, on the control of a nonlinear system with constraints. The final results show that the Q-function-based losses significantly reduce the amount of constraint violations while achieving comparable or better closed-loop costs.

SYNov 27, 2025
L4acados: Learning-based models for acados, applied to Gaussian process-based predictive control

Amon Lahr, Joshua Näf, Kim P. Wabersich et al.

Incorporating learning-based models, such as artificial neural networks or Gaussian processes, into model predictive control (MPC) strategies can significantly improve control performance and online adaptation capabilities for real-world applications. Still, enabling state-of-the-art implementations of learning-based models for MPC is complicated by the challenge of interfacing machine learning frameworks with real-time optimal control software. This work aims at filling this gap by incorporating external sensitivities in sequential quadratic programming solvers for nonlinear optimal control. To this end, we provide L4acados, a general framework for incorporating Python-based dynamics models in the real-time optimal control software acados. By computing external sensitivities via a user-defined Python module, L4acados enables the implementation of MPC controllers with learning-based residual models in acados, while supporting parallelization of sensitivity computations when preparing the quadratic subproblems. We demonstrate significant speed-ups and superior scaling properties of L4acados compared to available software using a neural-network-based control example. Last, we provide an efficient and modular real-time implementation of Gaussian process-based MPC using L4acados, which is applied to two hardware examples: autonomous miniature racing, as well as motion control of a full-scale autonomous vehicle for an ISO lane change maneuver.

24.4ROMar 23
Semi-Infinite Programming for Collision-Avoidance in Optimal and Model Predictive Control

Yunfan Gao, Florian Messerer, Niels van Duijkeren et al.

This paper presents a novel approach for collision avoidance in optimal and model predictive control, in which the environment is represented by a large number of points and the robot as a union of padded polygons. The conditions that none of the points shall collide with the robot can be written in terms of an infinite number of constraints per obstacle point. We show that the resulting semi-infinite programming (SIP) optimal control problem (OCP) can be efficiently tackled through a combination of two methods: local reduction and an external active-set method. Specifically, this involves iteratively identifying the closest point obstacles, determining the lower-level distance minimizer among all feasible robot shape parameters, and solving the upper-level finitely-constrained subproblems. In addition, this paper addresses robust collision avoidance in the presence of ellipsoidal state uncertainties. Enforcing constraint satisfaction over all possible uncertainty realizations extends the dimension of constraint infiniteness. The infinitely many constraints arising from translational uncertainty are handled by local reduction together with the robot shape parameterization, while rotational uncertainty is addressed via a backoff reformulation. A controller implemented based on the proposed method is demonstrated on a real-world robot running at 20Hz, enabling fast and collision-free navigation in tight spaces. An application to 3D collision avoidance is also demonstrated in simulation.

OCMay 2, 2025Code
Differentiable Nonlinear Model Predictive Control

Jonathan Frey, Katrin Baumgärtner, Gianluca Frison et al.

The efficient computation of parametric solution sensitivities is a key challenge in the integration of learning-enhanced methods with nonlinear model predictive control (MPC), as their availability is crucial for many learning algorithms. While approaches presented in the machine learning community are limited to convex or unconstrained formulations, this paper discusses the computation of solution sensitivities of general nonlinear programs (NLPs) using the implicit function theorem (IFT) and smoothed optimality conditions treated in interior-point methods (IPM). We detail sensitivity computation within a sequential quadratic programming (SQP) method which employs an IPM for the quadratic subproblems. The publication is accompanied by an efficient open-source implementation within the framework, providing both forward and adjoint sensitivities for general optimal control problems, achieving speedups exceeding 3x over the state-of-the-art solver mpc.pytorch.

73.9SYApr 20
Identification of a Kalman filter: consistency of local solutions

Léo Simpson, Moritz Diehl

Prediction error and maximum likelihood methods are powerful tools for identifying linear dynamical systems and, in particular, enable the joint estimation of model parameters and the Kalman filter used for state estimation. A key limitation, however, is that these methods require solving a generally non-convex optimization problem to global optimality. This paper analyzes the statistical behavior of local minimizers in the special case where only the Kalman gain is estimated. We prove that these local solutions are statistically consistent estimates of the true Kalman gain. This follows from asymptotic unimodality: as the dataset grows, the objective function converges to a limit with a unique local (and therefore global) minimizer. We further provide guidelines for designing the optimization problem for Kalman filter tuning and discuss extensions to the joint estimation of additional linear parameters and noise covariances. Finally, the theoretical results are illustrated using three examples of increasing complexity. The main practical takeaway of this paper is that difficulties caused by local minimizers in system identification are, at least, not attributable to the tuning of the Kalman gain.

SYFeb 4, 2025
Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification

Rudolf Reiter, Jasper Hoffmann, Dirk Reinhardt et al.

The fields of MPC and RL consider two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from diverse communities and different requirements. Various technical discrepancies, particularly the role of an environment model as part of the algorithm, lead to methodologies with nearly complementary advantages. Due to their orthogonal benefits, research interest in combination methods has recently increased significantly, leading to a large and growing set of complex ideas leveraging MPC and RL. This work illuminates the differences, similarities, and fundamentals that allow for different combination algorithms and categorizes existing work accordingly. Particularly, we focus on the versatile actor-critic RL approach as a basis for our categorization and examine how the online optimization approach of MPC can be used to improve the overall closed-loop performance of a policy.

SYJun 6, 2024
AC4MPC: Actor-Critic Reinforcement Learning for Nonlinear Model Predictive Control

Rudolf Reiter, Andrea Ghezzi, Katrin Baumgärtner et al.

\Ac{MPC} and \ac{RL} are two powerful control strategies with, arguably, complementary advantages. In this work, we show how actor-critic \ac{RL} techniques can be leveraged to improve the performance of \ac{MPC}. The \ac{RL} critic is used as an approximation of the optimal value function, and an actor roll-out provides an initial guess for primal variables of the \ac{MPC}. A parallel control architecture is proposed where each \ac{MPC} instance is solved twice for different initial guesses. Besides the actor roll-out initialization, a shifted initialization from the previous solution is used. Thereafter, the actor and the critic are again used to approximately evaluate the infinite horizon cost of these trajectories. The control actions from the lowest-cost trajectory are applied to the system at each time step. We establish that the proposed algorithm is guaranteed to outperform the original \ac{RL} policy plus an error term that depends on the accuracy of the critic and decays with the horizon length of the \ac{MPC} formulation. Moreover, we do not require globally optimal solutions for these guarantees to hold. The approach is demonstrated on an illustrative toy example and an \ac{AD} overtaking scenario.

LGNov 20, 2020
Convergence Analysis of Homotopy-SGD for non-convex optimization

Matilde Gargiani, Andrea Zanelli, Quoc Tran-Dinh et al.

First-order stochastic methods for solving large-scale non-convex optimization problems are widely used in many big-data applications, e.g. training deep neural networks as well as other complex and potentially non-convex machine learning models. Their inexpensive iterations generally come together with slow global convergence rate (mostly sublinear), leading to the necessity of carrying out a very high number of iterations before the iterates reach a neighborhood of a minimizer. In this work, we present a first-order stochastic algorithm based on a combination of homotopy methods and SGD, called Homotopy-Stochastic Gradient Descent (H-SGD), which finds interesting connections with some proposed heuristics in the literature, e.g. optimization by Gaussian continuation, training by diffusion, mollifying networks. Under some mild assumptions on the problem structure, we conduct a theoretical analysis of the proposed algorithm. Our analysis shows that, with a specifically designed scheme for the homotopy parameter, H-SGD enjoys a global linear rate of convergence to a neighborhood of a minimum while maintaining fast and inexpensive iterations. Experimental evaluations confirm the theoretical results and show that H-SGD can outperform standard SGD.

ROOct 21, 2020
An Efficient Real-Time NMPC for Quadrotor Position Control under Communication Time-Delay

Barbara Barros Carlos, Tommaso Sartor, Andrea Zanelli et al.

The advances in computer processor technology have enabled the application of nonlinear model predictive control (NMPC) to agile systems, such as quadrotors. These systems are characterized by their underactuation, nonlinearities, bounded inputs, and time-delays. Classical control solutions fall short in overcoming these difficulties and fully exploiting the capabilities offered by such platforms. This paper presents the design and implementation of an efficient position controller for quadrotors based on real-time NMPC with time-delay compensation and bounds enforcement on the actuators. To deal with the limited computational resources onboard, an offboard control architecture is proposed. It is implemented using the high-performance software package acados, which solves optimal control problems and implements a real-time iteration (RTI) variant of a sequential quadratic programming (SQP) scheme with Gauss-Newton Hessian approximation. The quadratic subproblems (QP) in the SQP scheme are solved with HPIPM, an interior-point method solver, built on top of the linear algebra library BLASFEO, finely tuned for multiple CPU architectures. Solution times are further reduced by reformulating the QPs using the efficient partial condensing algorithm implemented in HPIPM. We demonstrate the capabilities of our architecture using the Crazyflie 2.1 nano-quadrotor.

OCJun 12, 2020
Kernel Distributionally Robust Optimization

Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl et al.

We propose kernel distributionally robust optimization (Kernel DRO) using insights from the robust optimization theory and functional analysis. Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity sets, which can be generalized to sets based on integral probability metrics and finite-order moment bounds. This perspective unifies multiple existing robust and stochastic optimization methods. We prove a theorem that generalizes the classical duality in the mathematical problem of moments. Enabled by this theorem, we reformulate the maximization with respect to measures in DRO into the dual program that searches for RKHS functions. Using universal RKHSs, the theorem applies to a broad class of loss functions, lifting common limitations such as polynomial losses and knowledge of the Lipschitz constant. We then establish a connection between DRO and stochastic optimization with expectation constraints. Finally, we propose practical algorithms based on both batch convex solvers and stochastic functional gradient, which apply to general optimization and machine learning tasks.

LGJun 3, 2020
On the Promise of the Stochastic Generalized Gauss-Newton Method for Training DNNs

Matilde Gargiani, Andrea Zanelli, Moritz Diehl et al.

Following early work on Hessian-free methods for deep learning, we study a stochastic generalized Gauss-Newton method (SGN) for training DNNs. SGN is a second-order optimization method, with efficient iterations, that we demonstrate to often require substantially fewer iterations than standard SGD to converge. As the name suggests, SGN uses a Gauss-Newton approximation for the Hessian matrix, and, in order to compute an approximate search direction, relies on the conjugate gradient method combined with forward and reverse automatic differentiation. Despite the success of SGD and its first-order variants, and despite Hessian-free methods based on the Gauss-Newton Hessian approximation having been already theoretically proposed as practical methods for training DNNs, we believe that SGN has a lot of undiscovered and yet not fully displayed potential in big mini-batch scenarios. For this setting, we demonstrate that SGN does not only substantially improve over SGD in terms of the number of iterations, but also in terms of runtime. This is made possible by an efficient, easy-to-use and flexible implementation of SGN we propose in the Theano deep learning platform, which, unlike Tensorflow and Pytorch, supports forward automatic differentiation. This enables researchers to further study and improve this promising optimization technique and hopefully reconsider stochastic second-order methods as competitive optimization techniques for training DNNs; we also hope that the promise of SGN may lead to forward automatic differentiation being added to Tensorflow or Pytorch. Our results also show that in big mini-batch scenarios SGN is more robust than SGD with respect to its hyperparameters (we never had to tune its step-size for our benchmarks!), which eases the expensive process of hyperparameter tuning that is instead crucial for the performance of first-order methods.

OCMar 31, 2020
Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem

Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl et al.

In order to anticipate rare and impactful events, we propose to quantify the worst-case risk under distributional ambiguity using a recent development in kernel methods -- the kernel mean embedding. Specifically, we formulate the generalized moment problem whose ambiguity set (i.e., the moment constraint) is described by constraints in the associated reproducing kernel Hilbert space in a nonparametric manner. We then present the tractable approximation and its theoretical justification. As a concrete application, we numerically test the proposed method in characterizing the worst-case constraint violation probability in the context of a constrained stochastic control system.

OCJan 28, 2020
A Kernel Mean Embedding Approach to Reducing Conservativeness in Stochastic Programming and Control

Jia-Jie Zhu, Moritz Diehl, Bernhard Schölkopf

We apply kernel mean embedding methods to sample-based stochastic optimization and control. Specifically, we use the reduced-set expansion method as a way to discard sampled scenarios. The effect of such constraint removal is improved optimality and decreased conservativeness. This is achieved by solving a distributional-distance-regularized optimization problem. We demonstrated this optimization formulation is well-motivated in theory, computationally tractable and effective in numerical algorithms.

ROJan 15, 2020
CIAO$^\star$: MPC-based Safe Motion Planning in Predictable Dynamic Environments

Tobias Schoels, Per Rutquist, Luigi Palmieri et al.

Robots have been operating in dynamic environments and shared workspaces for decades. Most optimization based motion planning methods, however, do not consider the movement of other agents, e.g. humans or other robots, and therefore do not guarantee collision avoidance in such scenarios. This paper builds upon the Convex Inner ApprOximation (CIAO) method and proposes a motion planning algorithm that guarantees collision avoidance in predictable dynamic environments. Furthermore, it generalizes CIAO's free region concept to arbitrary norms and proposes a cost function to approximate time optimal motion planning. The proposed method, CIAO$^\star$, finds kinodynamically feasible and collision free trajectories for constrained single body robots using model predictive control (MPC). It optimizes the motion of one agent and accounts for the predicted movement of surrounding agents and obstacles. The experimental evaluation shows that CIAO$^\star$ reaches close to time optimal behavior.

MLNov 25, 2019
A New Distribution-Free Concept for Representing, Comparing, and Propagating Uncertainty in Dynamical Systems with Kernel Probabilistic Programming

Jia-Jie Zhu, Krikamol Muandet, Moritz Diehl et al.

This work presents the concept of kernel mean embedding and kernel probabilistic programming in the context of stochastic systems. We propose formulations to represent, compare, and propagate uncertainties for fairly general stochastic dynamics in a distribution-free manner. The new tools enjoy sound theory rooted in functional analysis and wide applicability as demonstrated in distinct numerical examples. The implication of this new concept is a new mode of thinking about the statistical nature of uncertainty in dynamical systems.

ROSep 18, 2019
An NMPC Approach using Convex Inner Approximations for Online Motion Planning with Guaranteed Collision Avoidance

Tobias Schoels, Luigi Palmieri, Kai O. Arras et al.

Even though mobile robots have been around for decades, trajectory optimization and continuous time collision avoidance remain subject of active research. Existing methods trade off between path quality, computational complexity, and kinodynamic feasibility. This work approaches the problem using a nonlinear model predictive control (NMPC) framework, that is based on a novel convex inner approximation of the collision avoidance constraint. The proposed Convex Inner ApprOximation (CIAO) method finds kinodynamically feasible and continuous time collision free trajectories, in few iterations, typically one. For a feasible initialization, the approach is guaranteed to find a feasible solution, i.e. it preserves feasibility. Our experimental evaluation shows that CIAO outperforms state of the art baselines in terms of planning efficiency and path quality. Experiments on a robot with 12 states show that it also scales to high-dimensional systems. Furthermore real-world experiments demonstrate its capability of unifying trajectory optimization and tracking for safe motion planning in dynamic environments.

SYNov 29, 2017
A Family of Iterative Gauss-Newton Shooting Methods for Nonlinear Optimal Control

Markus Giftthaler, Michael Neunert, Markus Stäuble et al.

This paper introduces a family of iterative algorithms for unconstrained nonlinear optimal control. We generalize the well-known iLQR algorithm to different multiple-shooting variants, combining advantages like straight-forward initialization and a closed-loop forward integration. All algorithms have similar computational complexity, i.e. linear complexity in the time horizon, and can be derived in the same computational framework. We compare the full-step variants of our algorithms and present several simulation examples, including a high-dimensional underactuated robot subject to contact switches. Simulation results show that our multiple-shooting algorithms can achieve faster convergence, better local contraction rates and much shorter runtimes than classical iLQR, which makes them a superior choice for nonlinear model predictive control applications.

OCAug 22, 2015
A quaternion-based model for optimal control of the SkySails airborne wind energy system

Michael Erhard, Greg Horn, Moritz Diehl

Airborne wind energy systems are capable of extracting energy from higher wind speeds at higher altitudes. The configuration considered in this paper is based on a tethered kite flown in a pumping orbit. This pumping cycle generates energy by winching out at high tether forces and driving a generator while flying figures-of-eight, or lemniscates, as crosswind pattern. Then, the tether is reeled in while keeping the kite at a neutral position, thus leaving a net amount of generated energy. In order to achieve an economic operation, optimization of pumping cycles is of great interest. In this paper, first the principles of airborne wind energy will be briefly revisited. The first contribution is a singularity-free model for the tethered kite dynamics in quaternion representation, where the model is derived from first principles. The second contribution is an optimal control formulation and numerical results for complete pumping cycles. Based on the developed model, the setup of the optimal control problem (OCP) is described in detail along with its numerical solution based on the direct multiple shooting method in the CasADi optimization environment. Optimization results for a pumping cycle consisting of six lemniscates show that the approach is capable to find an optimal orbit in a few minutes of computation time. For this optimal orbit, the power output is increased by a factor of two compared to a sophisticated initial guess for the considered test scenario.