NAMay 19
Reliable sampling-based RKHS norm estimation via superconvergenceTizian Wenzel, Abdullah Tokmak, Christian Fiedler
Kernel methods are one of the cornerstones of learning-based control, modern system identification, surrogate modelling, and related fields. A key advantage of this class of learning and function approximation methods is the availability of quantitative error bounds, which in turn play a key role in guaranteeing the safety of learned controllers and related learning-based algorithms. However, these error bounds rely on a particular property of the target function -- its reproducing kernel Hilbert space (RKHS) norm -- which is usually impossible to obtain in practice. Motivated by this severe shortcoming, we present a novel sampling-based RKHS norm estimation approach with a solid theoretical foundation, leveraging very recent advances in the theory of superconvergence in kernel methods. Our method is applicable to a broad range of practically relevant function classes and requires only reasonable prior knowledge about the target function. Extensive numerical experiments demonstrate the efficacy and practical applicability of the proposed method. By providing a reliable RKHS norm estimation approach, we remove a major obstacle to the practical deployment of learning-based control algorithms.
LGSep 2, 2024
PACSBO: Probably approximately correct safe Bayesian optimizationAbdullah Tokmak, Thomas B. Schön, Dominik Baumann
Safe Bayesian optimization (BO) algorithms promise to find optimal control policies without knowing the system dynamics while at the same time guaranteeing safety with high probability. In exchange for those guarantees, popular algorithms require a smoothness assumption: a known upper bound on a norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it is unclear how to, in practice, obtain an upper bound of an unknown function in its corresponding RKHS. In response, we propose an algorithm that estimates an upper bound on the RKHS norm of an unknown function from data and investigate its theoretical properties. Moreover, akin to Lipschitz-based methods, we treat the RKHS norm as a local rather than a global object, and thus reduce conservatism. Integrating the RKHS norm estimation and the local interpretation of the RKHS norm into a safe BO algorithm yields PACSBO, an algorithm for probably approximately correct safe Bayesian optimization, for which we provide numerical and hardware experiments that demonstrate its applicability and benefits over popular safe BO algorithms.
OCDec 12, 2025
Safe Bayesian optimization across noise models via scenario programmingAbdullah Tokmak, Thomas B. Schön, Dominik Baumann
Safe Bayesian optimization (BO) with Gaussian processes is an effective tool for tuning control policies in safety-critical real-world systems, specifically due to its sample efficiency and safety guarantees. However, most safe BO algorithms assume homoscedastic sub-Gaussian measurement noise, an assumption that does not hold in many relevant applications. In this article, we propose a straightforward yet rigorous approach for safe BO across noise models, including homoscedastic sub-Gaussian and heteroscedastic heavy-tailed distributions. We provide a high-probability bound on the measurement noise via the scenario approach, integrate these bounds into high probability confidence intervals, and prove safety and optimality for our proposed safe BO algorithm. We deploy our algorithm in synthetic examples and in tuning a controller for the Franka Emika manipulator in simulation.
SYDec 15, 2023
Automatic nonlinear MPC approximation with closed-loop guaranteesAbdullah Tokmak, Christian Fiedler, Melanie N. Zeilinger et al.
Safety guarantees are vital in many control applications, such as robotics. Model predictive control (MPC) provides a constructive framework for controlling safety-critical systems, but is limited by its computational complexity. We address this problem by presenting a novel algorithm that automatically computes an explicit approximation to nonlinear MPC schemes while retaining closed-loop guarantees. Specifically, the problem can be reduced to a function approximation problem, which we then tackle by proposing ALKIA-X, the Adaptive and Localized Kernel Interpolation Algorithm with eXtrapolated reproducing kernel Hilbert space norm. ALKIA-X is a non-iterative algorithm that ensures numerically well-conditioned computations, a fast-to-evaluate approximating function, and the guaranteed satisfaction of any desired bound on the approximation error. Hence, ALKIA-X automatically computes an explicit function that approximates the MPC, yielding a controller suitable for safety-critical systems and high sampling rates. We apply ALKIA-X to approximate two nonlinear MPC schemes, demonstrating reduced computational demand and applicability to realistic problems.
LGMar 13, 2025
Safe exploration in reproducing kernel Hilbert spacesAbdullah Tokmak, Kiran G. Krishnan, Thomas B. Schön et al.
Popular safe Bayesian optimization (BO) algorithms learn control policies for safety-critical systems in unknown environments. However, most algorithms make a smoothness assumption, which is encoded by a known bounded norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it remains unclear how to reliably obtain the RKHS norm of an unknown function. In this work, we propose a safe BO algorithm capable of estimating the RKHS norm from data. We provide statistical guarantees on the RKHS norm estimation, integrate the estimated RKHS norm into existing confidence intervals and show that we retain theoretical guarantees, and prove safety of the resulting safe BO algorithm. We apply our algorithm to safely optimize reinforcement learning policies on physics simulators and on a real inverted pendulum, demonstrating improved performance, safety, and scalability compared to the state-of-the-art.
SYApr 1
Safe learning-based control via function-based uncertainty quantificationAbdullah Tokmak, Toni Karvonen, Thomas B. Schön et al.
Uncertainty quantification is essential when deploying learning-based control methods in safety-critical systems. This is commonly realized by constructing uncertainty tubes that enclose the unknown function of interest, e.g., the reward and constraint functions or the underlying dynamics model, with high probability. However, existing approaches for uncertainty quantification typically rely on restrictive assumptions on the unknown function, such as known bounds on functional norms or Lipschitz constants, and struggle with discontinuities. In this paper, we model the unknown function as a random function from which independent and identically distributed realizations can be generated, and construct uncertainty tubes via the scenario approach that hold with high probability and rely solely on the sampled realizations. We integrate these uncertainty tubes into a safe Bayesian optimization algorithm, which we then use to safely tune control parameters on a real Furuta pendulum.
SYAug 19, 2025
Towards safe control parameter tuning in distributed multi-agent systemsAbdullah Tokmak, Thomas B. Schön, Dominik Baumann
Many safety-critical real-world problems, such as autonomous driving and collaborative robots, are of a distributed multi-agent nature. To optimize the performance of these systems while ensuring safety, we can cast them as distributed optimization problems, where each agent aims to optimize their parameters to maximize a coupled reward function subject to coupled constraints. Prior work either studies a centralized setting, does not consider safety, or struggles with sample efficiency. Since we require sample efficiency and work with unknown and nonconvex rewards and constraints, we solve this optimization problem using safe Bayesian optimization with Gaussian process regression. Moreover, we consider nearest-neighbor communication between the agents. To capture the behavior of non-neighboring agents, we reformulate the static global optimization problem as a time-varying local optimization problem for each agent, essentially introducing time as a latent variable. To this end, we propose a custom spatio-temporal kernel to integrate prior knowledge. We show the successful deployment of our algorithm in simulations.