Sandra Hirche

h-index40

70papers

1,849citations

Novelty52%

AI Score57

Ranked #18,510 of 201,018 authors (top 9%)#4,158 in LG (top 10%)

70 Papers

LGJun 24, 2022

Physically Consistent Learning of Conservative Lagrangian Systems with Gaussian Processes

Giulio Evangelisti, Sandra Hirche

This paper proposes a physically consistent Gaussian Process (GP) enabling the identification of uncertain Lagrangian systems. The function space is tailored according to the energy components of the Lagrangian and the differential equation structure, analytically guaranteeing physical and mathematical properties such as energy conservation and quadratic form. The novel formulation of Cholesky decomposed matrix kernels allow the probabilistic preservation of positive definiteness. Only differential input-to-output measurements of the function map are required while Gaussian noise is permitted in torques, velocities, and accelerations. We demonstrate the effectiveness of the approach in numerical simulation.

LGJul 4, 2022

Safe Reinforcement Learning via Confidence-Based Filters

Sebastian Curi, Armin Lederer, Sandra Hirche et al.

Ensuring safety is a crucial challenge when deploying reinforcement learning (RL) to real-world systems. We develop confidence-based safety filters, a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard RL techniques, based on probabilistic dynamics models. Our approach is based on a reformulation of state constraints in terms of cost functions, reducing safety verification to a standard RL task. By exploiting the concept of hallucinating inputs, we extend this formulation to determine a "backup" policy that is safe for the unknown system with high probability. Finally, the nominal policy is minimally adjusted at every time step during a roll-out towards the backup policy, such that safe recovery can be guaranteed afterwards. We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.

SYMar 31, 2023

Learning-Based Optimal Control with Performance Guarantees for Unknown Systems with Latent States

Robert Lefringhausen, Supitsana Srithasan, Armin Lederer et al.

As control engineering methods are applied to increasingly complex systems, data-driven approaches for system identification appear as a promising alternative to physics-based modeling. While the Bayesian approaches prevalent for safety-critical applications usually rely on the availability of state measurements, the states of a complex system are often not directly measurable. It may then be necessary to jointly estimate the dynamics and the latent state, making the quantification of uncertainties and the design of controllers with formal performance guarantees considerably more challenging. This paper proposes a novel method for the computation of an optimal input trajectory for unknown nonlinear systems with latent states based on a combination of particle Markov chain Monte Carlo methods and scenario theory. Probabilistic performance guarantees are derived for the resulting input trajectory, and an approach to validate the performance of arbitrary control laws is presented. The effectiveness of the proposed method is demonstrated in a numerical simulation.

LGApr 21, 2023

Estimating Motor Symptom Presence and Severity in Parkinson's Disease from Wrist Accelerometer Time Series using ROCKET and InceptionTime

Cedric Donié, Neha Das, Satoshi Endo et al.

Parkinson's disease (PD) is a neurodegenerative condition characterized by frequently changing motor symptoms, necessitating continuous symptom monitoring for more targeted treatment. Classical time series classification and deep learning techniques have demonstrated limited efficacy in monitoring PD symptoms using wearable accelerometer data due to complex PD movement patterns and the small size of available datasets. We investigate InceptionTime and RandOm Convolutional KErnel Transform (ROCKET) as they are promising for PD symptom monitoring. InceptionTime's high learning capacity is well-suited to modeling complex movement patterns, while ROCKET is suited to small datasets. With random search methodology, we identify the highest-scoring InceptionTime architecture and compare its performance to ROCKET with a ridge classifier and a multi-layer perceptron (MLP) on wrist motion data from PD patients. Our findings indicate that all approaches can learn to estimate tremor severity and bradykinesia presence with moderate performance but encounter challenges in detecting dyskinesia. Among the presented approaches, ROCKET demonstrates higher scores in identifying dyskinesia, whereas InceptionTime exhibits slightly better performance in tremor and bradykinesia estimation. Notably, both methods outperform the multi-layer perceptron. In conclusion, InceptionTime can classify complex wrist motion time series and holds potential for continuous symptom monitoring in PD with further development.

LGFeb 23, 2023

Sharp Calibrated Gaussian Processes

Alexandre Capone, Geoff Pleiss, Sandra Hirche

While Gaussian processes are a mainstay for various engineering and scientific applications, the uncertainty estimates don't satisfy frequentist guarantees and can be miscalibrated in practice. State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance, which yields confidence intervals that are potentially too coarse. To remedy this, we present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance but using a different set of hyperparameters chosen to satisfy an empirical calibration constraint. This results in a calibration approach that is considerably more flexible than existing approaches, which we optimize to yield tight predictive quantiles. Our approach is shown to yield a calibrated model under reasonable assumptions. Furthermore, it outperforms existing approaches in sharpness when employed for calibrated regression.

SYJul 10, 2023

Episodic Gaussian Process-Based Learning Control with Vanishing Tracking Errors

Armin Lederer, Jonas Umlauft, Sandra Hirche

Due to the increasing complexity of technical systems, accurate first principle models can often not be obtained. Supervised machine learning can mitigate this issue by inferring models from measurement data. Gaussian process regression is particularly well suited for this purpose due to its high data-efficiency and its explicit uncertainty representation, which allows the derivation of prediction error bounds. These error bounds have been exploited to show tracking accuracy guarantees for a variety of control approaches, but their direct dependency on the training data is generally unclear. We address this issue by deriving a Bayesian prediction error bound for GP regression, which we show to decay with the growth of a novel, kernel-based measure of data density. Based on the prediction error bound, we prove time-varying tracking accuracy guarantees for learned GP models used as feedback compensation of unknown nonlinearities, and show to achieve vanishing tracking error with increasing data density. This enables us to develop an episodic approach for learning Gaussian process models, such that an arbitrary tracking accuracy can be guaranteed. The effectiveness of the derived theory is demonstrated in several simulations.

OCJul 23, 2024

Data-Driven Stochastic Optimal Control in Reproducing Kernel Hilbert Spaces

Nicolas Hoischen, Petar Bevanda, Stefan Sosnowski et al.

This paper proposes a fully data-driven approach for optimal control of nonlinear control-affine systems represented by a stochastic diffusion. The focus is on the scenario where both the nonlinear dynamics and stage cost functions are unknown, while only a control penalty function and constraints are provided. To this end, we embed state probability densities into a reproducing kernel Hilbert space (RKHS) to leverage recent advances in operator regression, thereby identifying Markov transition operators associated with controlled diffusion processes. This operator learning approach integrates naturally with convex operator-theoretic Hamilton-Jacobi-Bellman recursions that scale linearly with state dimensionality, effectively solving a wide range of nonlinear optimal control problems. Numerical results demonstrate its ability to address diverse nonlinear control tasks, including the depth regulation of an autonomous underwater vehicle.

SYNov 3, 2023

Safe Online Dynamics Learning with Initially Unknown Models and Infeasible Safety Certificates

Alexandre Capone, Ryan Cosner, Aaron Ames et al.

Safety-critical control tasks with high levels of uncertainty are becoming increasingly common. Typically, techniques that guarantee safety during learning and control utilize constraint-based safety certificates, which can be leveraged to compute safe control inputs. However, excessive model uncertainty can render robust safety certification methods or infeasible, meaning no control input satisfies the constraints imposed by the safety certificate. This paper considers a learning-based setting with a robust safety certificate based on a control barrier function (CBF) second-order cone program. If the control barrier function certificate is feasible, our approach leverages it to guarantee safety. Otherwise, our method explores the system dynamics to collect data and recover the feasibility of the control barrier function constraint. To this end, we employ a method inspired by well-established tools from Bayesian optimization. We show that if the sampling frequency is high enough, we recover the feasibility of the robust CBF certificate, guaranteeing safety. Our approach requires no prior model and corresponds, to the best of our knowledge, to the first algorithm that guarantees safety in settings with occasionally infeasible safety certificates without requiring a backup non-learning-based controller.

SYOct 4, 2023

Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach

Alexandre Capone, Tim Brüdigam, Sandra Hirche

Solving chance-constrained stochastic optimal control problems is a significant challenge in control. This is because no analytical solutions exist for up to a handful of special cases. A common and computationally efficient approach for tackling chance-constrained stochastic optimal control problems consists of reformulating the chance constraints as hard constraints with a constraint-tightening parameter. However, in such approaches, the choice of constraint-tightening parameter remains challenging, and guarantees can mostly be obtained assuming that the process noise distribution is known a priori. Moreover, the chance constraints are often not tightly satisfied, leading to unnecessarily high costs. This work proposes a data-driven approach for learning the constraint-tightening parameters online during control. To this end, we reformulate the choice of constraint-tightening parameter for the closed-loop as a binary regression problem. We then leverage a highly expressive \gls{gp} model for binary regression to approximate the smallest constraint-tightening parameters that satisfy the chance constraints. By tuning the algorithm parameters appropriately, we show that the resulting constraint-tightening parameters satisfy the chance constraints up to an arbitrarily small margin with high probability. Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints in numerical experiments, resulting in a lower average cost than three other state-of-the-art approaches.

MLNov 13, 2025

Operator Models for Continuous-Time Offline Reinforcement Learning

Nicolas Hoischen, Petar Bevanda, Max Beier et al.

Continuous-time stochastic processes underlie many natural and engineered systems. In healthcare, autonomous driving, and industrial control, direct interaction with the environment is often unsafe or impractical, motivating offline reinforcement learning from historical data. However, there is limited statistical understanding of the approximation errors inherent in learning policies from offline datasets. We address this by linking reinforcement learning to the Hamilton-Jacobi-Bellman equation and proposing an operator-theoretic algorithm based on a simple dynamic programming recursion. Specifically, we represent our world model in terms of the infinitesimal generator of controlled diffusion processes learned in a reproducing kernel Hilbert space. By integrating statistical learning methods and operator theory, we establish global convergence of the value function and derive finite-sample guarantees with bounds tied to system properties such as smoothness and stability. Our theoretical and numerical results indicate that operator-based approaches may hold promise in solving offline reinforcement learning using continuous-time optimal control.

LGNov 7, 2025

SAD-Flower: Flow Matching for Safe, Admissible, and Dynamically Consistent Planning

Tzu-Yuan Huang, Armin Lederer, Dai-Jie Wu et al.

Flow matching (FM) has shown promising results in data-driven planning. However, it inherently lacks formal guarantees for ensuring state and action constraints, whose satisfaction is a fundamental and crucial requirement for the safety and admissibility of planned trajectories on various systems. Moreover, existing FM planners do not ensure the dynamical consistency, which potentially renders trajectories inexecutable. We address these shortcomings by proposing SAD-Flower, a novel framework for generating Safe, Admissible, and Dynamically consistent trajectories. Our approach relies on an augmentation of the flow with a virtual control input. Thereby, principled guidance can be derived using techniques from nonlinear control theory, providing formal guarantees for state constraints, action constraints, and dynamic consistency. Crucially, SAD-Flower operates without retraining, enabling test-time satisfaction of unseen constraints. Through extensive experiments across several tasks, we demonstrate that SAD-Flower outperforms various generative-model-based baselines in ensuring constraint satisfaction.

SYMay 15

Optimal Delay Compensation in Networked Predictive Control

Severin Beger, Yihui Lin, Katarina Stanojevic et al.

Networked Predictive Control is widely used to mitigate the effect of delays and dropouts in Networked Control Systems, particularly when these exceed the sampling time. A key design choice of these methods is the delay bound, which determines the prediction horizon and the robustness to information loss. This work develops a systematic method to select the optimal bound by quantifying the trade-off between prediction errors and open-loop operation caused by communication losses. Simulation studies demonstrate the performance gains achieved with the optimal bound.

ROSep 18, 2024

Reinforcement Learning with Lie Group Orientations for Robotics

Martin Schuck, Jan Brüdigam, Sandra Hirche et al.

Handling orientations of robots and objects is a crucial aspect of many applications. Yet, ever so often, there is a lack of mathematical correctness when dealing with orientations, especially in learning pipelines involving, for example, artificial neural networks. In this paper, we investigate reinforcement learning with orientations and propose a simple modification of the network's input and output that adheres to the Lie group structure of orientations. As a result, we obtain an easy and efficient implementation that is directly usable with existing learning libraries and achieves significantly better performance than other common orientation representations. We briefly introduce Lie theory specifically for orientations in robotics to motivate and outline our approach. Subsequently, a thorough empirical evaluation of different combinations of orientation representations for states and actions demonstrates the superior performance of our proposed approach in different scenarios, including: direct orientation control, end effector orientation control, and pick-and-place tasks.

SYDec 8, 2025

Learning Dynamics from Infrequent Output Measurements for Uncertainty-Aware Optimal Control

Robert Lefringhausen, Theodor Springer, Sandra Hirche

Reliable optimal control is challenging when the dynamics of a nonlinear system are unknown and only infrequent, noisy output measurements are available. This work addresses this setting of limited sensing by formulating a Bayesian prior over the continuous-time dynamics and latent state trajectory in state-space form and updating it through a targeted marginal Metropolis-Hastings sampler equipped with a numerical ODE integrator. The resulting posterior samples are used to formulate a scenario-based optimal control problem that accounts for both model and measurement uncertainty and is solved using standard nonlinear programming methods. The approach is validated in a numerical case study on glucose regulation using a Type 1 diabetes model.

ROMar 31

Interactive Force-Impedance Control

Fan Shao, Satoshi Endo, Sandra Hirche et al.

Human collaboration with robots requires flexible role adaptation, enabling the robot to switch between an active leader and a passive follower. Effective role switching depends on accurately estimating human intentions, which is typically achieved through external force analysis, nominal robot dynamics, or data-driven approaches. However, these methods are primarily effective in contact-sparse environments. When robots under hybrid or unified force-impedance control physically interact with active humans or non-passive environments, the robotic system may lose passivity and thus compromise safety. To address this challenge, this paper proposes a unified Interactive Force-Impedance Control (IFIC) framework that adapts to interaction power flow, ensuring safe and effortless interaction in contact-rich environments. The proposed control architecture is formulated within a port-Hamiltonian framework, incorporating both interaction and task control ports, thereby guaranteeing autonomous system passivity. Experiments in both rigid and soft contact scenarios demonstrate that IFIC ensures stable collaboration under active human interaction, reduces contact impact forces and interaction force oscillations.

LGSep 25, 2024

Risk-averse learning with delayed feedback

Siyi Wang, Zifan Wang, Karl Henrik Johansson et al.

In real-world scenarios, risk-averse learning is valuable for mitigating potential adverse outcomes. However, the delayed feedback makes it challenging to assess and manage risk effectively. In this paper, we investigate risk-averse learning using Conditional Value at Risk (CVaR) as risk measure, while incorporating feedback with random but bounded delays. We develop two risk-averse learning algorithms that rely on one-point and two-point zeroth-order optimization approaches, respectively. The dynamic regrets of the algorithms are analyzed in terms of the cumulative delay and the number of total samplings. In the absence of delay, the regret bounds match the established bounds of zeroth-order stochastic gradient methods for risk-averse learning. Furthermore, the two-point risk-averse learning outperforms the one-point algorithm by achieving a smaller regret bound. We provide numerical experiments on a dynamic pricing problem to demonstrate the performance of the algorithms.

LGFeb 5, 2024

Whom to Trust? Elective Learning for Distributed Gaussian Process Regression

Zewen Yang, Xiaobing Dai, Akshat Dubey et al.

This paper introduces an innovative approach to enhance distributed cooperative learning using Gaussian process (GP) regression in multi-agent systems (MASs). The key contribution of this work is the development of an elective learning algorithm, namely prior-aware elective distributed GP (Pri-GP), which empowers agents with the capability to selectively request predictions from neighboring agents based on their trustworthiness. The proposed Pri-GP effectively improves individual prediction accuracy, especially in cases where the prior knowledge of an agent is incorrect. Moreover, it eliminates the need for computationally intensive variance calculations for determining aggregation weights in distributed GP. Furthermore, we establish a prediction error bound within the Pri-GP framework, ensuring the reliability of predictions, which is regarded as a crucial property in safety-critical MAS applications.

MAFeb 5, 2024

Cooperative Learning with Gaussian Processes for Euler-Lagrange Systems Tracking Control under Switching Topologies

Zewen Yang, Songbo Dong, Armin Lederer et al.

This work presents an innovative learning-based approach to tackle the tracking control problem of Euler-Lagrange multi-agent systems with partially unknown dynamics operating under switching communication topologies. The approach leverages a correlation-aware cooperative algorithm framework built upon Gaussian process regression, which adeptly captures inter-agent correlations for uncertainty predictions. A standout feature is its exceptional efficiency in deriving the aggregation weights achieved by circumventing the computationally intensive posterior variance calculations. Through Lyapunov stability analysis, the distributed control law ensures bounded tracking errors with high probability. Simulation experiments validate the protocol's efficacy in effectively managing complex scenarios, establishing it as a promising solution for robust tracking control in multi-agent systems characterized by uncertain dynamics and dynamic communication structures.

SYMay 12, 2024

Nonparametric Control Koopman Operators

Petar Bevanda, Bas Driessen, Lucian Cristian Iacob et al.

This paper presents a novel Koopman composition operator representation framework for control systems in reproducing kernel Hilbert spaces (RKHSs) that is free of explicit dictionary or input parametrizations. By establishing fundamental equivalences between different model representations, we are able to close the gap of control system operator learning and infinite-dimensional regression, enabling various empirical estimators and the connection to the well-understood learning theory in RKHSs under one unified framework. Consequently, our proposed framework allows for arbitrarily accurate finite-rank approximations in infinite-dimensional spaces and leads to finite-dimensional predictors without apriori restrictions to a finite span of functions or inputs. To enable applications to high-dimensional control systems, we improve the scalability of our proposed control Koopman operator estimates by utilizing sketching techniques. Numerical experiments demonstrate superior prediction accuracy compared to bilinear EDMD, especially in high dimensions. Finally, we show that our learned models are readily interfaced with linear-parameter-varying techniques for model predictive control.

LGFeb 10, 2025

Koopman-Equivariant Gaussian Processes

Petar Bevanda, Max Beier, Armin Lederer et al.

Credible forecasting and representation learning of dynamical systems are of ever-increasing importance for reliable decision-making. To that end, we propose a family of Gaussian processes (GP) for dynamical systems with linear time-invariant responses, which are nonlinear only in initial conditions. This linearity allows us to tractably quantify forecasting and representational uncertainty, simultaneously alleviating the challenge of computing the distribution of trajectories from a GP-based dynamical system and enabling a new probabilistic treatment of learning Koopman operator representations. Using a trajectory-based equivariance -- which we refer to as \textit{Koopman equivariance} -- we obtain a GP model with enhanced generalization capabilities. To allow for large-scale regression, we equip our framework with variational inference based on suitable inducing points. Experiments demonstrate on-par and often better forecasting performance compared to kernel-based methods for learning dynamical systems.

SYMay 14, 2024

Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes

Samuel Tesfazgi, Leonhard Sprandl, Armin Lederer et al.

Learning from expert demonstrations to flexibly program an autonomous system with complex behaviors or to predict an agent's behavior is a powerful tool, especially in collaborative control settings. A common method to solve this problem is inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to the optimization of an intrinsic cost function that reflects its intent and informs its control actions. While the framework is expressive, it is also computationally demanding and generally lacks convergence guarantees. We therefore propose a novel, stability-certified IRL approach by reformulating the cost function inference problem to learning control Lyapunov functions (CLF) from demonstrations data. By additionally exploiting closed-form expressions for associated control policies, we are able to efficiently search the space of CLFs by observing the attractor landscape of the induced dynamics. For the construction of the inverse optimal CLFs, we use a Sum of Squares and formulate a convex optimization problem. We present a theoretical analysis of the optimality properties provided by the CLF and evaluate our approach using both simulated and real-world data.

SYFeb 5, 2024

Decentralized Event-Triggered Online Learning for Safe Consensus of Multi-Agent Systems with Gaussian Process Regression

Xiaobing Dai, Zewen Yang, Mengtian Xu et al.

Consensus control in multi-agent systems has received significant attention and practical implementation across various domains. However, managing consensus control under unknown dynamics remains a significant challenge for control design due to system uncertainties and environmental disturbances. This paper presents a novel learning-based distributed control law, augmented by an auxiliary dynamics. Gaussian processes are harnessed to compensate for the unknown components of the multi-agent system. For continuous enhancement in predictive performance of Gaussian process model, a data-efficient online learning strategy with a decentralized event-triggered mechanism is proposed. Furthermore, the control performance of the proposed approach is ensured via the Lyapunov theory, based on a probabilistic guarantee for prediction error bounds. To demonstrate the efficacy of the proposed learning-based controller, a comparative analysis is conducted, contrasting it with both conventional distributed control laws and offline learning methodologies.

SYApr 10

On the Existence of Quadratic Control Lyapunov Functions for Koopman-Operator based Bilinear Systems

Sami Leon Noel Aziz Hanna, Nicolas Hoischen, Sandra Hirche et al.

Koopman operator-based methods enable data-driven bilinear representations of unknown nonlinear control systems. Accurate representations often demand significantly higher dimensions than the original system, making control design challenging. Control Lyapunov Functions (CLFs) are widely used for controller synthesis, with quadratic CLF candidates being the most common due to their simplicity. Yet, we show that this class is highly restrictive, especially when the state dimension is large: under mild conditions, their existence implies stabilizability of the bilinear system by a constant input -- that is, the control remains fixed over time. We establish this result by formulating a quadratically constrained quadratic program (QCQP) that exactly characterizes valid CLFs. Since QCQPs are NP-hard, we propose a convex semidefinite relaxation that offers a sufficient validity condition. For single-input systems, we prove that a quadratic CLF requires constant control stabilizability, and empirically demonstrate that this extends to high-dimensional multi-input systems in many cases.

OCDec 2, 2024

Kernel-Based Optimal Control: An Infinitesimal Generator Approach

Petar Bevanda, Nicolas Hoischen, Tobias Wittmann et al.

This paper presents a novel operator-theoretic approach for optimal control of nonlinear stochastic systems within reproducing kernel Hilbert spaces. Our learning framework leverages data samples of system dynamics and stage cost functions, with only control penalties and constraints provided. The proposed method directly learns the infinitesimal generator of a controlled stochastic diffusion in an infinite-dimensional hypothesis space. We demonstrate that our approach seamlessly integrates with modern convex operator-theoretic Hamilton-Jacobi-Bellman recursions, enabling a data-driven solution to the optimal control problems. Furthermore, our learning framework includes nonparametric estimators for uncontrolled infinitesimal generators as a special case. Numerical experiments, ranging from synthetic differential equations to simulated robotic systems, showcase the advantages of our approach compared to both modern data-driven and classical nonlinear programming methods for optimal control.

SYMar 31

Where to Put Safety? Control Barrier Function Placement in Networked Control Systems

Severin Beger, Yuling Chen, Sandra Hirche

Ensuring safe behavior is critical for modern autonomous cyber-physical systems. Control barrier functions (CBFs) are widely used to enforce safety in autonomous systems, yet their placement within networked control architectures remains largely unexplored. In this work, we investigate where to enforce safety in a networked control system in which a remote model predictive controller (MPC) communicates with the plant over a delayed network. We compare two safety strategies: i) a local myopic CBF filter applied at the plant and ii) predictive CBF constraints embedded in the remote MPC. For both architectures, we derive state-dependent disturbance tolerance bounds and show that safety placement induces a fundamental trade-off: local CBFs provide higher disturbance tolerance due to access to fresh state measurements, whereas MPC-CBF enables improved performance through anticipatory behavior, but yields stricter admissible disturbance levels. Motivated by this insight, we propose a combined architecture that integrates predictive and local safety mechanisms. The theoretical findings are illustrated in simulations on a planar three-degree-of-freedom robot performing a collision-avoidance task.

LGDec 16, 2024

Asynchronous Distributed Gaussian Process Regression for Online Learning and Dynamical Systems: Complementary Document

Zewen Yang, Xiaobing Dai, Sandra Hirche

This is a complementary document for the paper titled "Asynchronous Distributed Gaussian Process Regression for Online Learning and Dynamical Systems".

SYApr 3, 2024

Risk-averse Learning with Non-Stationary Distributions

Siyi Wang, Zifan Wang, Xinlei Yi et al.

Considering non-stationary environments in online optimization enables decision-maker to effectively adapt to changes and improve its performance over time. In such cases, it is favorable to adopt a strategy that minimizes the negative impact of change to avoid potentially risky situations. In this paper, we investigate risk-averse online optimization where the distribution of the random cost changes over time. We minimize risk-averse objective function using the Conditional Value at Risk (CVaR) as risk measure. Due to the difficulty in obtaining the exact CVaR gradient, we employ a zeroth-order optimization approach that queries the cost function values multiple times at each iteration and estimates the CVaR gradient using the sampled values. To facilitate the regret analysis, we use a variation metric based on Wasserstein distance to capture time-varying distributions. Given that the distribution variation is sub-linear in the total number of episodes, we show that our designed learning algorithm achieves sub-linear dynamic regret with high probability for both convex and strongly convex functions. Moreover, theoretical results suggest that increasing the number of samples leads to a reduction in the dynamic regret bounds until the sampling number reaches a specific limit. Finally, we provide numerical experiments of dynamic pricing in a parking lot to illustrate the efficacy of the designed algorithm.

LGApr 3, 2025

Learning Geometrically-Informed Lyapunov Functions with Deep Diffeomorphic RBF Networks

Samuel Tesfazgi, Leonhard Sprandl, Sandra Hirche

The practical deployment of learning-based autonomous systems would greatly benefit from tools that flexibly obtain safety guarantees in the form of certificate functions from data. While the geometrical properties of such certificate functions are well understood, synthesizing them using machine learning techniques still remains a challenge. To mitigate this issue, we propose a diffeomorphic function learning framework where prior structural knowledge of the desired output is encoded in the geometry of a simple surrogate function, which is subsequently augmented through an expressive, topology-preserving state-space transformation. Thereby, we achieve an indirect function approximation framework that is guaranteed to remain in the desired hypothesis space. To this end, we introduce a novel approach to construct diffeomorphic maps based on RBF networks, which facilitate precise, local transformations around data. Finally, we demonstrate our approach by learning diffeomorphic Lyapunov functions from real-world data and apply our method to different attractor systems.

SYApr 2, 2025

Barrier Certificates for Unknown Systems with Latent States and Polynomial Dynamics using Bayesian Inference

Robert Lefringhausen, Sami Leon Noel Aziz Hanna, Elias August et al.

Certifying safety in dynamical systems is crucial, but barrier certificates - widely used to verify that system trajectories remain within a safe region - typically require explicit system models. When dynamics are unknown, data-driven methods can be used instead, yet obtaining a valid certificate requires rigorous uncertainty quantification. For this purpose, existing methods usually rely on full-state measurements, limiting their applicability. This paper proposes a novel approach for synthesizing barrier certificates for unknown systems with latent states and polynomial dynamics. A Bayesian framework is employed, where a prior in state-space representation is updated using output data via a targeted marginal Metropolis-Hastings sampler. The resulting samples are used to construct a barrier certificate through a sum-of-squares program. Probabilistic guarantees for its validity with respect to the true, unknown system are obtained by testing on an additional set of posterior samples. The approach and its probabilistic guarantees are illustrated through a numerical simulation.

ROMay 14, 2024

Data-driven Force Observer for Human-Robot Interaction with Series Elastic Actuators using Gaussian Processes

Samuel Tesfazgi, Markus Keßler, Emilio Trigili et al.

Ensuring safety and adapting to the user's behavior are of paramount importance in physical human-robot interaction. Thus, incorporating elastic actuators in the robot's mechanical design has become popular, since it offers intrinsic compliance and additionally provide a coarse estimate for the interaction force by measuring the deformation of the elastic components. While observer-based methods have been shown to improve these estimates, they rely on accurate models of the system, which are challenging to obtain in complex operating environments. In this work, we overcome this issue by learning the unknown dynamics components using Gaussian process (GP) regression. By employing the learned model in a Bayesian filtering framework, we improve the estimation accuracy and additionally obtain an observer that explicitly considers local model uncertainty in the confidence measure of the state estimate. Furthermore, we derive guaranteed estimation error bounds, thus, facilitating the use in safety-critical applications. We demonstrate the effectiveness of the proposed approach experimentally in a human-exoskeleton interaction scenario.

LGApr 17, 2024

Predictive Model Development to Identify Failed Healing in Patients after Non-Union Fracture Surgery

Cedric Donié, Marie K. Reumann, Tony Hartung et al.

Bone non-union is among the most severe complications associated with trauma surgery, occurring in 10-30% of cases after long bone fractures. Treating non-unions requires a high level of surgical expertise and often involves multiple revision surgeries, sometimes even leading to amputation. Thus, more accurate prognosis is crucial for patient well-being. Recent advances in machine learning (ML) hold promise for developing models to predict non-union healing, even when working with smaller datasets, a commonly encountered challenge in clinical domains. To demonstrate the effectiveness of ML in identifying candidates at risk of failed non-union healing, we applied three ML models (logistic regression, support vector machine, and XGBoost) to the clinical dataset TRUFFLE, which includes 797 patients with long bone non-union. The models provided prediction results with 70% sensitivity, and the specificities of 66% (XGBoost), 49% (support vector machine), and 43% (logistic regression). These findings offer valuable clinical insights because they enable early identification of patients at risk of failed non-union healing after the initial surgical revision treatment protocol.

LGMay 25, 2023

Koopman Kernel Regression

Petar Bevanda, Max Beier, Armin Lederer et al.

Many machine learning approaches for decision making, such as reinforcement learning, rely on simulators or predictive models to forecast the time-evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, making their use in optimization-based decision-making challenging. Koopman operator theory offers a beneficial paradigm for addressing this problem by characterizing forecasts via linear time-invariant (LTI) ODEs, turning multi-step forecasts into sparse matrix multiplication. Though there exists a variety of learning approaches, they usually lack crucial learning-theoretic guarantees, making the behavior of the obtained models with increasing data and dimensionality unclear. We address the aforementioned by deriving a universal Koopman-invariant reproducing kernel Hilbert space (RKHS) that solely spans transformations into LTI dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation for novel convergence results and generalization error bounds under weaker assumptions than existing work. Our experiments demonstrate superior forecasting performance compared to Koopman operator and sequential data predictors in RKHS.

SYMay 14, 2023

Can Learning Deteriorate Control? Analyzing Computational Delays in Gaussian Process-Based Event-Triggered Online Learning

Xiaobing Dai, Armin Lederer, Zewen Yang et al.

When the dynamics of systems are unknown, supervised machine learning techniques are commonly employed to infer models from data. Gaussian process (GP) regression is a particularly popular learning method for this purpose due to the existence of prediction error bounds. Moreover, GP models can be efficiently updated online, such that event-triggered online learning strategies can be pursued to ensure specified tracking accuracies. However, existing trigger conditions must be able to be evaluated at arbitrary times, which cannot be achieved in practice due to non-negligible computation times. Therefore, we first derive a delay-aware tracking error bound, which reveals an accuracy-delay trade-off. Based on this result, we propose a novel event trigger for GP-based online learning with computational delays, which we show to offer advantages over offline trained GP models for sufficiently small computation times. Finally, we demonstrate the effectiveness of the proposed event trigger for online learning in simulations.

SYFeb 23, 2022

Networked Online Learning for Control of Safety-Critical Resource-Constrained Systems based on Gaussian Processes

Armin Lederer, Mingmin Zhang, Samuel Tesfazgi et al.

Safety-critical technical systems operating in unknown environments require the ability to quickly adapt their behavior, which can be achieved in control by inferring a model online from the data stream generated during operation. Gaussian process-based learning is particularly well suited for safety-critical applications as it ensures bounded prediction errors. While there exist computationally efficient approximations for online inference, these approaches lack guarantees for the prediction error and have high memory requirements, and are therefore not applicable to safety-critical systems with tight memory constraints. In this work, we propose a novel networked online learning approach based on Gaussian process regression, which addresses the issue of limited local resources by employing remote data management in the cloud. Our approach formally guarantees a bounded tracking error with high probability, which is exploited to identify the most relevant data to achieve a certain control performance. We further propose an effective data transmission scheme between the local system and the cloud taking bandwidth limitations and time delay of the transmission channel into account. The effectiveness of the proposed method is successfully demonstrated in a simulation.

SYJan 27, 2022

Towards Data-driven LQR with Koopmanizing Flows

Petar Bevanda, Max Beier, Shahab Heshmati-Alamdari et al.

We propose a novel framework for learning linear time-invariant (LTI) models for a class of continuous-time non-autonomous nonlinear dynamics based on a representation of Koopman operators. In general, the operator is infinite-dimensional but, crucially, linear. To utilize it for efficient LTI control design, we learn a finite representation of the Koopman operator that is linear in controls while concurrently learning meaningful lifting coordinates. For the latter, we rely on Koopmanizing Flows - a diffeomorphism-based representation of Koopman operators and extend it to systems with linear control entry. With such a learned model, we can replace the nonlinear optimal control problem with quadratic cost to that of a linear quadratic regulator (LQR), facilitating efficacious optimal control for nonlinear systems. The superior control performance of the proposed method is demonstrated on simulation examples.

LGDec 10, 2021

Structure-Preserving Learning Using Gaussian Processes and Variational Integrators

Jan Brüdigam, Martin Schuck, Alexandre Capone et al.

Gaussian process regression is increasingly applied for learning unknown dynamical systems. In particular, the implicit quantification of the uncertainty of the learned model makes it a promising approach for safety-critical applications. When using Gaussian process regression to learn unknown systems, a commonly considered approach consists of learning the residual dynamics after applying some generic discretization technique, which might however disregard properties of the underlying physical system. Variational integrators are a less common yet promising approach to discretization, as they retain physical properties of the underlying system, such as energy conservation and satisfaction of explicit kinematic constraints. In this work, we present a novel structure-preserving learning-based modelling approach that combines a variational integrator for the nominal dynamics of a mechanical system and learning residual dynamics with Gaussian process regression. We extend our approach to systems with known kinematic constraints and provide formal bounds on the prediction uncertainty. The simulative evaluation of the proposed method shows desirable energy conservation properties in accordance with general theoretical results and demonstrates exact constraint satisfaction for constrained dynamical systems.

LGDec 8, 2021

Diffeomorphically Learning Stable Koopman Operators

Petar Bevanda, Max Beier, Sebastian Kerz et al.

System representations inspired by the infinite-dimensional Koopman operator (generator) are increasingly considered for predictive modeling. Due to the operator's linearity, a range of nonlinear systems admit linear predictor representations - allowing for simplified prediction, analysis and control. However, finding meaningful finite-dimensional representations for prediction is difficult as it involves determining features that are both Koopman-invariant (evolve linearly under the dynamics) as well as relevant (spanning the original state) - a generally unsupervised problem. In this work, we present Koopmanizing Flows - a novel continuous-time framework for supervised learning of linear predictors for a class of nonlinear dynamics. In our model construction a latent diffeomorphically related linear system unfolds into a linear predictor through the composition with a monomial basis. The lifting, its linear dynamics and state reconstruction are learned simultaneously, while an unconstrained parameterization of Hurwitz matrices ensures asymptotic stability regardless of the operator approximation accuracy. The superior efficacy of Koopmanizing Flows is demonstrated in comparison to a state-of-the-art method on the well-known LASA handwriting benchmark.

SPNov 5, 2021

Adaptive Low-Pass Filtering using Sliding Window Gaussian Processes

Alejandro J. Ordóñez-Conejo, Armin Lederer, Sandra Hirche

When signals are measured through physical sensors, they are perturbed by noise. To reduce noise, low-pass filters are commonly employed in order to attenuate high frequency components in the incoming signal, regardless if they come from noise or the actual signal. Therefore, low-pass filters must be carefully tuned in order to avoid significant deterioration of the signal. This tuning requires prior knowledge about the signal, which is often not available in applications such as reinforcement learning or learning-based control. In order to overcome this limitation, we propose an adaptive low-pass filter based on Gaussian process regression. By considering a constant window of previous observations, updates and predictions fast enough for real-world filtering applications can be realized. Moreover, the online optimization of hyperparameters leads to an adaptation of the low-pass behavior, such that no prior tuning is necessary. We show that the estimation error of the proposed method is uniformly bounded, and demonstrate the flexibility and efficiency of the approach in several simulations.

LGOct 15, 2021

Learning the Koopman Eigendecomposition: A Diffeomorphic Approach

Petar Bevanda, Johannes Kirmayr, Stefan Sosnowski et al.

We present a novel data-driven approach for learning linear representations of a class of stable nonlinear systems using Koopman eigenfunctions. By learning the conjugacy map between a nonlinear system and its Jacobian linearization through a Normalizing Flow one can guarantee the learned function is a diffeomorphism. Using this diffeomorphism, we construct eigenfunctions of the nonlinear system via the spectral equivalence of conjugate systems - allowing the construction of linear predictors for nonlinear systems. The universality of the diffeomorphism learner leads to the universal approximation of the nonlinear system's Koopman eigenfunctions. The developed method is also safe as it guarantees the model is asymptotically stable regardless of the representation accuracy. To our best knowledge, this is the first work to close the gap between the operator, system and learning theories. The efficacy of our approach is shown through simulation examples.

LGOct 1, 2021

Personalized Rehabilitation Robotics based on Online Learning Control

Samuel Tesfazgi, Armin Lederer, Johannes F. Kunz et al.

The use of rehabilitation robotics in clinical applications gains increasing importance, due to therapeutic benefits and the ability to alleviate labor-intensive works. However, their practical utility is dependent on the deployment of appropriate control algorithms, which adapt the level of task-assistance according to each individual patient's need. Generally, the required personalization is achieved through manual tuning by clinicians, which is cumbersome and error-prone. In this work we propose a novel online learning control architecture, which is able to personalize the control force at run time to each individual user. To this end, we deploy Gaussian process-based online learning with previously unseen prediction and update rates. Finally, we evaluate our method in an experimental user study, where the learning controller is shown to provide personalized control, while also obtaining safe interaction forces.

ROSep 15, 2021

Linear-Time Contact and Friction Dynamics in Maximal Coordinates using Variational Integrators

Jan Brüdigam, Jana Janeva, Stefan Sosnowski et al.

Simulation of contact and friction dynamics is an important basis for control- and learning-based algorithms. However, the numerical difficulties of contact interactions pose a challenge for robust and efficient simulators. A maximal-coordinate representation of the dynamics enables efficient solving algorithms, but current methods in maximal coordinates require constraint stabilization schemes. Therefore, we propose an interior-point algorithm for the numerically robust treatment of rigid-body dynamics with contact interactions in maximal coordinates. Additionally, we discretize the dynamics with a variational integrator to prevent constraint drift. Our algorithm achieves linear-time complexity both in the number of contact points and the number of bodies, which is shown theoretically and demonstrated with an implementation. Furthermore, we simulate two robotic systems to highlight the applicability of the proposed algorithm.

LGSep 6, 2021

Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications

Alexandre Capone, Armin Lederer, Sandra Hirche

Gaussian processes have become a promising tool for various safety-critical settings, since the posterior variance can be used to directly estimate the model error and quantify risk. However, state-of-the-art techniques for safety-critical settings hinge on the assumption that the kernel hyperparameters are known, which does not apply in general. To mitigate this, we introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters. Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error of a Gaussian process with arbitrary hyperparameters. We do not require to know any bounds for the hyperparameters a priori, which is an assumption commonly found in related work. Instead, we are able to derive bounds from data in an intuitive fashion. We additionally employ the proposed technique to derive performance guarantees for a class of learning-based control problems. Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.

ROJul 30, 2021

Distributed Event- and Self-Triggered Coverage Control with Speed Constrained Unicycle Robots

Yuni Zhou, Lingxuan Kong, Stefan Sosnowski et al.

Voronoi coverage control is a particular problem of importance in the area of multi-robot systems, which considers a network of multiple autonomous robots, tasked with optimally covering a large area. This is a common task for fleets of fixed-wing Unmanned Aerial Vehicles (UAVs), which are described in this work by a unicycle model with constant forward-speed constraints. We develop event-based control/communication algorithms to relax the resource requirements on wireless communication and control actuators, an important feature for battery-driven or otherwise energy-constrained systems. To overcome the drawback that the event-triggered algorithm requires continuous measurement of system states, we propose a self-triggered algorithm to estimate the next triggering time. Hardware experiments illustrate the theoretical results.

LGJun 20, 2021

FedXGBoost: Privacy-Preserving XGBoost for Federated Learning

Nhan Khanh Le, Yang Liu, Quang Minh Nguyen et al.

Federated learning is the distributed machine learning framework that enables collaborative training across multiple parties while ensuring data privacy. Practical adaptation of XGBoost, the state-of-the-art tree boosting framework, to federated learning remains limited due to high cost incurred by conventional privacy-preserving methods. To address the problem, we propose two variants of federated XGBoost with privacy guarantee: FedXGBoost-SMM and FedXGBoost-LDP. Our first protocol FedXGBoost-SMM deploys enhanced secure matrix multiplication method to preserve privacy with lossless accuracy and lower overhead than encryption-based techniques. Developed independently, the second protocol FedXGBoost-LDP is heuristically designed with noise perturbation for local differential privacy, and empirically evaluated on real-world and synthetic datasets.

ROMay 25, 2021

Gaussian Process-based Stochastic Model Predictive Control for Overtaking in Autonomous Racing

Tim Brüdigam, Alexandre Capone, Sandra Hirche et al.

A fundamental aspect of racing is overtaking other race cars. Whereas previous research on autonomous racing has majorly focused on lap-time optimization, here, we propose a method to plan overtaking maneuvers in autonomous racing. A Gaussian process is used to learn the behavior of the leading vehicle. Based on the outputs of the Gaussian process, a stochastic Model Predictive Control algorithm plans optimistic trajectories, such that the controlled autonomous race car is able to overtake the leading vehicle. The proposed method is tested in a simple simulation scenario.

LGApr 9, 2021

Inverse Reinforcement Learning: A Control Lyapunov Approach

Samuel Tesfazgi, Armin Lederer, Sandra Hirche

Inferring the intent of an intelligent agent from demonstrations and subsequently predicting its behavior, is a critical task in many collaborative settings. A common approach to solve this problem is the framework of inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to an intrinsic cost function that reflects its intent and informs its control actions. In this work, we reformulate the IRL inference problem to learning control Lyapunov functions (CLF) from demonstrations by exploiting the inverse optimality property, which states that every CLF is also a meaningful value function. Moreover, the derived CLF formulation directly guarantees stability of inferred control policies. We show the flexibility of our proposed method by learning from goal-directed movement demonstrations in a continuous environment.

ROApr 9, 2021

Distributed Bayesian Online Learning for Cooperative Manipulation

Pablo Budde gen. Dohmann, Armin Lederer, Marcel Dißemond et al.

For tasks where the dynamics of multiple agents are physically coupled, e.g., in cooperative manipulation, the coordination between the individual agents becomes crucial, which requires exact knowledge of the interaction dynamics. This problem is typically addressed using centralized estimators, which can negatively impact the flexibility and robustness of the overall system. To overcome this shortcoming, we propose a novel distributed learning framework for the exemplary task of cooperative manipulation using Bayesian principles. Using only local state information each agent obtains an estimate of the object dynamics and grasp kinematics. These local estimates are combined using dynamic average consensus. Due to the strong probabilistic foundation of the method, each estimate of the object dynamics and grasp kinematics is accompanied by a measure of uncertainty, which allows to guarantee a bounded prediction error with high probability. Moreover, the Bayesian principles directly allow iterative learning with constant complexity, such that the proposed learning method can be used online in real-time applications. The effectiveness of the approach is demonstrated in a simulated cooperative manipulation task.

LGJan 13, 2021

Uniform Error and Posterior Variance Bounds for Gaussian Process Regression with Application to Safe Control

Armin Lederer, Jonas Umlauft, Sandra Hirche

In application areas where data generation is expensive, Gaussian processes are a preferred supervised learning model due to their high data-efficiency. Particularly in model-based control, Gaussian processes allow the derivation of performance guarantees using probabilistic model error bounds. To make these approaches applicable in practice, two open challenges must be solved i) Existing error bounds rely on prior knowledge, which might not be available for many real-world tasks. (ii) The relationship between training data and the posterior variance, which mainly drives the error bound, is not well understood and prevents the asymptotic analysis. This article addresses these issues by presenting a novel uniform error bound using Lipschitz continuity and an analysis of the posterior variance function for a large class of kernels. Additionally, we show how these results can be used to guarantee safe control of an unknown dynamical system and provide numerical illustration examples.

SYNov 20, 2020

The Impact of Data on the Stability of Learning-Based Control- Extended Version

Armin Lederer, Alexandre Capone, Thomas Beckers et al.

Despite the existence of formal guarantees for learning-based control approaches, the relationship between data and control performance is still poorly understood. In this paper, we propose a Lyapunov-based measure for quantifying the impact of data on the certifiable control performance. By modeling unknown system dynamics through Gaussian processes, we can determine the interrelation between model uncertainty and satisfaction of stability conditions. This allows us to directly asses the impact of data on the provable stationary control performance, and thereby the value of the data for the closed-loop system performance. Our approach is applicable to a wide variety of unknown nonlinear systems that are to be controlled by a generic learning-based control law, and the results obtained in numerical simulations indicate the efficacy of the proposed measure.

LGOct 6, 2020

Deep Learning based Uncertainty Decomposition for Real-time Control

Neha Das, Jonas Umlauft, Armin Lederer et al.

Data-driven control in unknown environments requires a clear understanding of the involved uncertainties for ensuring safety and efficient exploration. While aleatoric uncertainty that arises from measurement noise can often be explicitly modeled given a parametric description, it can be harder to model epistemic uncertainty, which describes the presence or absence of training data. The latter can be particularly useful for implementing exploratory control strategies when system dynamics are unknown. We propose a novel method for detecting the absence of training data using deep learning, which gives a continuous valued scalar output between $0$ (indicating low uncertainty) and $1$ (indicating high uncertainty). We utilize this detector as a proxy for epistemic uncertainty and show its advantages over existing approaches on synthetic and real-world datasets. Our approach can be directly combined with aleatoric uncertainty estimates and allows for uncertainty estimation in real-time as the inference is sample-free unlike existing approaches for uncertainty modeling. We further demonstrate the practicality of this uncertainty estimate in deploying online data-efficient control on a simulated quadcopter acted upon by an unknown disturbance model.