OC · Jul 23, 2024
Data-Driven Stochastic Optimal Control in Reproducing Kernel Hilbert Spaces
Nicolas Hoischen, Petar Bevanda, Stefan Sosnowski et al.
This paper proposes a fully data-driven approach for optimal control of nonlinear control-affine systems represented by a stochastic diffusion. The focus is on the scenario where both the nonlinear dynamics and stage cost functions are unknown, while only a control penalty function and constraints are provided. To this end, we embed state probability densities into a reproducing kernel Hilbert space (RKHS) to leverage recent advances in operator regression, thereby identifying Markov transition operators associated with controlled diffusion processes. This operator learning approach integrates naturally with convex operator-theoretic Hamilton-Jacobi-Bellman recursions that scale linearly with state dimensionality, effectively solving a wide range of nonlinear optimal control problems. Numerical results demonstrate its ability to address diverse nonlinear control tasks, including the depth regulation of an autonomous underwater vehicle.
ML · Nov 13, 2025
Operator Models for Continuous-Time Offline Reinforcement Learning
Nicolas Hoischen, Petar Bevanda, Max Beier et al.
Continuous-time stochastic processes underlie many natural and engineered systems. In healthcare, autonomous driving, and industrial control, direct interaction with the environment is often unsafe or impractical, motivating offline reinforcement learning from historical data. However, there is limited statistical understanding of the approximation errors inherent in learning policies from offline datasets. We address this by linking reinforcement learning to the Hamilton-Jacobi-Bellman equation and proposing an operator-theoretic algorithm based on a simple dynamic programming recursion. Specifically, we represent our world model in terms of the infinitesimal generator of controlled diffusion processes learned in a reproducing kernel Hilbert space. By integrating statistical learning methods and operator theory, we establish global convergence of the value function and derive finite-sample guarantees with bounds tied to system properties such as smoothness and stability. Our theoretical and numerical results indicate that operator-based approaches may hold promise in solving offline reinforcement learning using continuous-time optimal control.
SY · Apr 10
On the Existence of Quadratic Control Lyapunov Functions for Koopman-Operator based Bilinear Systems
Sami Leon Noel Aziz Hanna, Nicolas Hoischen, Sandra Hirche et al.
Koopman operator-based methods enable data-driven bilinear representations of unknown nonlinear control systems. Accurate representations often demand significantly higher dimensions than the original system, making control design challenging. Control Lyapunov Functions (CLFs) are widely used for controller synthesis, with quadratic CLF candidates being the most common due to their simplicity. Yet, we show that this class is highly restrictive, especially when the state dimension is large: under mild conditions, their existence implies stabilizability of the bilinear system by a constant input -- that is, the control remains fixed over time. We establish this result by formulating a quadratically constrained quadratic program (QCQP) that exactly characterizes valid CLFs. Since QCQPs are NP-hard, we propose a convex semidefinite relaxation that offers a sufficient validity condition. For single-input systems, we prove that a quadratic CLF requires constant control stabilizability, and empirically demonstrate that this extends to high-dimensional multi-input systems in many cases.
OC · Dec 2, 2024
Kernel-Based Optimal Control: An Infinitesimal Generator Approach
Petar Bevanda, Nicolas Hoischen, Tobias Wittmann et al.
This paper presents a novel operator-theoretic approach for optimal control of nonlinear stochastic systems within reproducing kernel Hilbert spaces. Our learning framework leverages data samples of system dynamics and stage cost functions, with only control penalties and constraints provided. The proposed method directly learns the infinitesimal generator of a controlled stochastic diffusion in an infinite-dimensional hypothesis space. We demonstrate that our approach seamlessly integrates with modern convex operator-theoretic Hamilton-Jacobi-Bellman recursions, enabling a data-driven solution to optimal control problems. Furthermore, our learning framework includes nonparametric estimators for uncontrolled infinitesimal generators as a special case. Numerical experiments, ranging from synthetic differential equations to simulated robotic systems, showcase the advantages of our approach compared to both modern data-driven and classical nonlinear programming methods for optimal control.