Stefan Streif

9 papers

8 citations

Novelty 43%

AI Score 46

Ranked #63,684 of 201,326 authors (top 32%); #14,477 in LG (top 34%)

9 Papers

SY · Mar 15, 2015

Stability for Receding-horizon Stochastic Model Predictive Control

Joel A. Paulson, Stefan Streif, Ali Mesbah

A stochastic model predictive control (SMPC) approach is presented for discrete-time linear systems with arbitrary time-invariant probabilistic uncertainties and additive Gaussian process noise. Closed-loop stability of the SMPC approach is established by appropriate selection of the cost function. Polynomial chaos is used for uncertainty propagation through system dynamics. The performance of the SMPC approach is demonstrated using the Van de Vusse reactions.
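As a rough illustration of how polynomial chaos turns an uncertain quantity into a handful of deterministic coefficients, the sketch below expands exp(xi) for a standard Gaussian xi in probabilists' Hermite polynomials and reads the mean and variance directly off the coefficients. The test function, the truncation order, and the function names are assumptions chosen for illustration; this is not the SMPC formulation from the paper.

```python
import math


def hermite_pce_of_exp(order):
    """Coefficients c_n of exp(xi) = sum_n c_n * He_n(xi) for xi ~ N(0, 1).

    Follows from the generating function exp(t*x - t^2/2) = sum_n He_n(x) t^n / n!
    evaluated at t = 1, which gives c_n = exp(1/2) / n!.
    """
    return [math.exp(0.5) / math.factorial(n) for n in range(order + 1)]


def pce_mean_and_variance(coeffs):
    """Mean and variance from Hermite PCE coefficients, using E[He_n(xi)^2] = n!."""
    mean = coeffs[0]
    variance = sum(
        c * c * math.factorial(n) for n, c in enumerate(coeffs) if n >= 1
    )
    return mean, variance


# Truncation order 8 is an arbitrary illustrative choice.
coeffs = hermite_pce_of_exp(order=8)
mean, variance = pce_mean_and_variance(coeffs)

# Closed-form moments of the lognormal variable exp(xi), for comparison.
exact_mean = math.exp(0.5)
exact_variance = math.exp(2.0) - math.exp(1.0)
```

Once the coefficients are known, the moments need no sampling, which is the property that makes such expansions attractive for propagating uncertainty through dynamics.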

SY · Aug 31, 2022

A stabilizing reinforcement learning approach for sampled systems with partially unknown models

Lukas Beckenbach, Pavel Osinenko, Stefan Streif

Reinforcement learning is commonly associated with training of reward-maximizing (or cost-minimizing) agents, in other words, controllers. It can be applied in a model-free or model-based fashion, using a priori or online collected system data to train involved parametric architectures. In general, online reinforcement learning does not guarantee closed-loop stability unless special measures are taken, for instance, through learning constraints or tailored training rules. Particularly promising are hybrids of reinforcement learning with "classical" control approaches. In this work, we suggest a method to guarantee practical stability of the system-controller closed loop in a purely online learning setting, i.e., without offline training. Moreover, we assume only partial knowledge of the system model. To achieve the claimed results, we employ techniques of classical adaptive control. The implementation of the overall control scheme is provided explicitly in a digital, sampled setting. That is, the controller receives the state of the system and computes the control action at discrete, specifically, equidistant moments in time. The method is tested in adaptive traction control and cruise control where it proved to significantly reduce the cost.

27.9 · LG · Mar 18

Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies

Sinan Ibrahim, Grégoire Ouerdane, Hadi Salloum et al.

The objective comparison of Reinforcement Learning (RL) algorithms is notoriously complex as outcomes and benchmarking of performances of different RL approaches are critically sensitive to environmental design, reward structures, and stochasticity inherent in both algorithmic learning and environmental dynamics. To manage this complexity, we introduce a rigorous benchmarking framework by extending converse optimality to discrete-time, control-affine, nonlinear systems with noise. Our framework provides necessary and sufficient conditions, under which a prescribed value function and policy are optimal for constructed systems, enabling the systematic generation of benchmark families via homotopy variations and randomized parameters. We validate it by automatically constructing diverse environments, demonstrating our framework's capacity for a controlled and comprehensive evaluation across algorithms. By assessing standard methods against a ground-truth optimum, our work delivers a reproducible foundation for precise and rigorous RL benchmarking.

38.9 · SP · Apr 6

A Tutorial to Multirate Extended Kalman Filter Design for Monitoring of Agricultural Anaerobic Digestion Plants

Simon Hellmann, Terrance Wilms, Stefan Streif et al.

In many applications of biotechnology, measurements are available at different sampling rates, e.g., due to online sensors and offline lab analysis. Offline measurements typically involve time delays that may be unknown a priori due to the underlying laboratory procedures. This multirate (MR) setting poses a challenge to Kalman filtering, where conventionally measurement data is assumed to be available on an equidistant time grid and without delays. This tutorial paper derives the MR version of an extended Kalman filter (EKF) based on sample state augmentation, and applies it to the anaerobic digestion (AD) process in a simulative agricultural setting. The performance of the MR-EKF is investigated for various scenarios including varying delay lengths, measurement noise levels, plant-model mismatch (PMM), and initial state error. Given adequate tuning, the MR-EKF can reliably estimate the process state and, thus, appropriately fuse the delayed offline measurements and smooth the noisy online measurements. Because of the sample state augmentation approach, the delay length of offline measurements does not critically affect the performance of the state estimation, provided that observability is not lost during the delays. Poor state initialization and PMM affect convergence more than measurement noise levels. Furthermore, selecting an appropriate tuning was found to be critically important for successful application of the MR-EKF, for which a systematic approach is presented. This tutorial provides implementation guidance for practitioners seeking to successfully apply state estimation for multirate systems. Thus, it contributes to the development of demand-driven operation of biogas plants, which may aid in stabilizing a renewable electricity grid.
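The idea of sample state augmentation can be sketched on a toy problem: when a lab sample is taken, a copy of the current state is appended to the filter state and frozen, and when the result arrives later it is fused as a measurement of that frozen copy. The sketch below assumes a scalar linear model x[k+1] = a*x[k] + w rather than the nonlinear AD model, so it is a plain Kalman filter and not the paper's MR-EKF. All names and numbers (`clone_state`, `predict`, `update_delayed`, a, q, r) are hypothetical.

```python
def clone_state(x, p):
    """At the sampling instant, append a copy of the state: z = [x_now, x_sample]."""
    return [x, x], [[p, p], [p, p]]


def predict(z, P, a, q):
    """Propagate x_now with x[k+1] = a*x[k] + w; the stored sample stays frozen."""
    z_new = [a * z[0], z[1]]
    P_new = [[a * a * P[0][0] + q, a * P[0][1]],
             [a * P[1][0], P[1][1]]]
    return z_new, P_new


def update_delayed(z, P, y, r):
    """Fuse the delayed lab value y = x_sample + v, i.e. measurement row H = [0, 1]."""
    s = P[1][1] + r                      # innovation variance
    k = [P[0][1] / s, P[1][1] / s]       # Kalman gain
    innov = y - z[1]
    z_new = [z[0] + k[0] * innov, z[1] + k[1] * innov]
    # Covariance update (I - K H) P written out for the 2x2 case.
    P_new = [[P[0][0] - k[0] * P[1][0], P[0][1] - k[0] * P[1][1]],
             [P[1][0] - k[1] * P[1][0], P[1][1] - k[1] * P[1][1]]]
    return z_new, P_new


# Illustrative numbers: sample taken now, lab result arrives 3 steps later.
a, q, r = 0.9, 0.01, 0.04
z, P = clone_state(x=1.0, p=0.5)
for _ in range(3):
    z, P = predict(z, P, a, q)
var_before = P[0][0]
z, P = update_delayed(z, P, y=1.2, r=r)
x_now, var_after = z[0], P[0][0]
```

Because the cross-covariance between the current state and the frozen sample is carried through the prediction steps, the late measurement still corrects the current estimate; after the update the augmented entry can be dropped again.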

3.1 · SY · Apr 15

Time-varying optimal control under measurement errors

Patrick Schmidt, Stefan Streif

Solving optimal control problems to determine a stabilizing controller involves a significant computational effort. Time-varying optimal control provides a remedy by designing a tracking system, given as an ordinary differential equation, to track the solution of the optimal control problem. To improve the applicability of the method, measurement errors are considered in this paper and it is described how these errors influence a control Lyapunov function-based decay condition. As a result of these investigations, input-affine constraints that meet the standard formulation and that describe the set of admissible controls are obtained. The paper also derives a requirement on the necessary measurement accuracy as well as a triggering condition for taking a new measurement. The main theorem combines these results into a robustly stabilizing control algorithm, meaning that all closed-loop trajectories starting in a vicinity around the true state converge to zero. Additionally, the tracking system ensures that the optimal control is tracked at the end of each sampling period. The effectiveness of this approach is demonstrated using a train acceleration model and the well-known predator-prey model.

11.0 · AI · May 11

Hierarchical Causal Abduction: A Foundation Framework for Explainable Model Predictive Control

Ramesh Arvind Naagarajan, Zühal Wagner, Stefan Streif

Model Predictive Control (MPC) is widely used to operate safety-critical infrastructure by predicting future trajectories and optimizing control actions. However, nonlinear dynamics, hard safety constraints, and numerical optimization often render individual control moves opaque to human operators, undermining trust and hindering deployment. This paper presents Hierarchical Causal Abduction (HCA), which combines (i) physics-informed reasoning via domain knowledge graphs, (ii) optimization evidence from Karush-Kuhn-Tucker (KKT) multipliers, and (iii) temporal causal discovery via the PCMCI algorithm to generate faithful, human-interpretable explanations for control actions computed by nonlinear MPC. Across three diverse control applications (greenhouse climate, building HVAC, chemical process engineering) with expert validation, HCA improves explanation accuracy by 53% over LIME (0.478 vs. 0.311) using a single set of cross-domain parameters without per-domain tuning; domain-specific KKT-threshold calibration over 2-3 days further increases accuracy to 0.88. Ablation studies confirm that each evidence source is essential, with 32-37% accuracy degradation when any component is removed, and HCA's ranking-and-validation methodology generalizes beyond MPC to other prediction-based decision systems, including learning-based control and trajectory planning.

LG · Jan 8, 2021

On the Turnpike to Design of Deep Neural Nets: Explicit Depth Bounds

Timm Faulwasser, Arne-Jens Hempel, Stefan Streif

It is well known that the training of Deep Neural Networks (DNN) can be formalized in the language of optimal control. In this context, this paper leverages classical turnpike properties of optimal control problems to attempt a quantifiable answer to the question of how many layers should be considered in a DNN. The underlying assumption is that the number of neurons per layer -- i.e., the width of the DNN -- is kept constant. Pursuing a different route than the classical analysis of approximation properties of sigmoidal functions, we prove explicit bounds on the required depths of DNNs based on asymptotic reachability assumptions and a dissipativity-inducing choice of the regularization terms in the training problem. Numerical results obtained for the two-spiral task data set for classification indicate that the proposed estimates can provide non-conservative depth bounds.

SY · Nov 11, 2014

A Probabilistic Approach to Robust Optimal Experiment Design with Chance Constraints

Ali Mesbah, Stefan Streif

Accurate estimation of parameters is paramount in developing high-fidelity models for complex dynamical systems. Model-based optimal experiment design (OED) approaches enable systematic design of dynamic experiments to generate input-output data sets with high information content for parameter estimation. Standard OED approaches, however, face two challenges: (i) experiment design under incomplete system information due to unknown true parameters, which usually requires many iterations of OED; (ii) inability to systematically account for the inherent uncertainties of complex systems, which can lead to diminished effectiveness of the designed optimal excitation signal as well as violation of system constraints. This paper presents a robust OED approach for nonlinear systems with arbitrarily-shaped time-invariant probabilistic uncertainties. Polynomial chaos is used for efficient uncertainty propagation. The distinct feature of the robust OED approach is the inclusion of chance constraints to ensure constraint satisfaction in a stochastic setting. The presented approach is demonstrated by optimal experimental design for the JAK-STAT5 signaling pathway that regulates various cellular processes in a biological cell.

OC · Oct 16, 2014

Stochastic Nonlinear Model Predictive Control with Efficient Sample Approximation of Chance Constraints

Stefan Streif, Matthias Karl, Ali Mesbah

This paper presents a stochastic model predictive control approach for nonlinear systems subject to time-invariant probabilistic uncertainties in model parameters and initial conditions. The stochastic optimal control problem entails a cost function in terms of expected values and higher moments of the states, and chance constraints that ensure probabilistic constraint satisfaction. The generalized polynomial chaos framework is used to propagate the time-invariant stochastic uncertainties through the nonlinear system dynamics, and to efficiently sample from the probability densities of the states to approximate the satisfaction probability of the chance constraints. To increase computational efficiency by avoiding excessive sampling, a statistical analysis is proposed to systematically determine a priori the least conservative constraint tightening required at a given sample size to guarantee a desired feasibility probability of the sample-approximated chance constraint optimization problem. In addition, a method is presented for sample-based approximation of the analytic gradients of the chance constraints, which increases the optimization efficiency significantly. The proposed stochastic nonlinear model predictive control approach is applicable to a broad class of nonlinear systems with the sufficient condition that each term is analytic with respect to the states, and separable with respect to the inputs, states and parameters. The closed-loop performance of the proposed approach is evaluated using the Williams-Otto reactor with seven states, and ten uncertain parameters and initial conditions. The results demonstrate the efficiency of the approach for real-time stochastic model predictive control and its capability to systematically account for probabilistic uncertainties in contrast to nonlinear model predictive control approaches.
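To make the notion of a sample-approximated chance constraint concrete, the sketch below estimates the probability that a constraint g(theta) <= 0 holds by counting satisfied samples of an uncertain parameter theta. It assumes a hypothetical one-step scalar model x1 = theta*x0 + u with Gaussian theta and samples the parameter directly. The paper instead samples via a polynomial chaos surrogate and adds a statistically derived constraint tightening, neither of which is reproduced here.

```python
import math
import random


def satisfaction_probability(constraint, sampler, n_samples, rng):
    """Fraction of sampled realizations theta for which constraint(theta) <= 0."""
    hits = sum(1 for _ in range(n_samples) if constraint(sampler(rng)) <= 0.0)
    return hits / n_samples


# Hypothetical setup: x1 = theta * x0 + u must stay below x_max.
x0, u, x_max = 1.0, 0.2, 1.5
theta_mean, theta_std = 1.0, 0.1


def sampler(rng):
    """Draw one realization of the uncertain parameter theta ~ N(mean, std^2)."""
    return rng.gauss(theta_mean, theta_std)


def constraint(theta):
    """g(theta) = x1 - x_max; the constraint is satisfied when g <= 0."""
    return theta * x0 + u - x_max


rng = random.Random(0)  # fixed seed so the estimate is reproducible
p_hat = satisfaction_probability(constraint, sampler, 20000, rng)

# For this linear Gaussian toy case the exact probability is available in closed form.
z = ((x_max - u) / x0 - theta_mean) / theta_std
p_exact = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

In an optimization loop, an estimate like `p_hat` would be compared against the required level 1 - beta; since the estimate itself is noisy at finite sample size, some tightening of the constraint is needed to retain a feasibility guarantee, which is the question the paper's statistical analysis addresses.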