SY · May 27
Local Observability and Moving Horizon Estimation-based Training of Feedforward Neural Networks
Yi Yang, Victor G. Lopez, Matthias A. Müller
In this paper, we propose a moving horizon estimation (MHE)-based training method for feedforward neural networks (FNNs) with rectified linear unit (ReLU) activation functions to determine their ideal weights from a control-theoretic perspective. This allows for a rigorous theoretical analysis of the trained network. First, we reformulate the FNN as a dynamical system with the weights as states. Then, we investigate the local observability of such a system. For two-layer FNNs with fixed output weights, we derive a sufficient condition under which the observability rank condition holds, ensuring a locally observable state. We also show that multi-layer FNNs in general fail to satisfy the observability rank condition. Based on this analysis, we develop a persistently exciting (PE) input design method, which renders a state distinguishable from its neighbors. The resulting local observability provides convergence guarantees for the proposed MHE-based training, where only the projection of the state onto the observable subspace is updated using a fixed-length window of input-output data. The effectiveness of the approach is illustrated via numerical examples.
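The observability rank condition mentioned in this abstract can be illustrated with a minimal sketch. It assumes a two-layer network y = vᵀ ReLU(W u) with fixed output weights v, treats the entries of W as the state, and checks whether the stacked output Jacobian has full column rank. The function name `output_jacobian` and the example inputs are hypothetical, and this is not the paper's sufficient condition or its input design method.

```python
import numpy as np

def output_jacobian(W, v, inputs):
    """Stack dy/dW (flattened row-major) for y = v^T relu(W u), one row per input u."""
    rows = []
    for u in inputs:
        active = (W @ u > 0).astype(float)            # ReLU derivative of each hidden neuron
        rows.append(np.outer(v * active, u).ravel())  # dy/dW_ij = v_i * active_i * u_j
    return np.vstack(rows)

# Assumed toy network: 2 hidden neurons, 2 inputs, fixed output weights.
W = np.array([[1.0, 0.0], [0.0, 1.0]])
v = np.array([1.0, 1.0])

# Inputs that activate each neuron separately and span the input space.
exciting = [np.array(u) for u in [(1.0, -1.0), (2.0, -1.0), (-1.0, 1.0), (-1.0, 2.0)]]
# Inputs that always activate both neurons together.
poor = [np.array(u) for u in [(1.0, 0.5), (0.5, 1.0), (1.0, 1.0), (2.0, 1.0)]]

rank_exciting = np.linalg.matrix_rank(output_jacobian(W, v, exciting))
rank_poor = np.linalg.matrix_rank(output_jacobian(W, v, poor))
print(rank_exciting, rank_poor, W.size)
```

In this toy case the first input set gives a Jacobian of rank 4, equal to the number of weights, while the second gives rank 2, since inputs that always activate both neurons only reveal the sum of the rows of W.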
SY · May 7
Data-based Moving Horizon Estimation under Irregularly Measured Data
Tobias M. Wolff, Isabelle Krauss, Victor G. Lopez et al. · tsinghua
In this work, we introduce a sample- and data-based moving horizon estimation framework for linear systems. We perform state estimation in a sample-based fashion in the sense that only a few irregular output measurements are assumed to be available. This setting is encountered in applications where measuring is expensive or time-consuming. Furthermore, the state estimation framework does not rely on a standard mathematical model, but on an implicit system representation based on measured data. We prove sample-based practical robust exponential stability of the proposed estimator under mild assumptions. Finally, we apply the proposed scheme to estimate the states of a gastrointestinal tract absorption system.
SY · Mar 31, 2023
An Efficient Off-Policy Reinforcement Learning Algorithm for the Continuous-Time LQR Problem
Victor G. Lopez, Matthias A. Müller
In this paper, an off-policy reinforcement learning algorithm is designed to solve the continuous-time LQR problem using only input-state data measured from the system. Different from other algorithms in the literature, we propose the use of a specific persistently exciting input as the exploration signal during the data collection step. We then show that, using this persistently excited data, the solution of the matrix equation in our algorithm is guaranteed to exist and to be unique at every iteration. Convergence of the algorithm to the optimal control input is also proven. Moreover, we formulate the policy evaluation step as the solution of a Sylvester-transpose equation, which increases the efficiency of its solution. Finally, a method to determine a stabilizing policy to initialize the algorithm using only measured data is proposed.
SY · May 14
Data-Based Control of Continuous-Time Linear Systems with Performance Specifications
Victor G. Lopez, Matthias A. Müller
The design of direct data-based controllers has become a fundamental part of control theory research in the last few years. In this paper, we consider three classes of data-based state feedback control problems for linear systems. In these control problems, additional performance requirements must be satisfied besides stabilization. First, we formulate and solve a trajectory-reference control problem, in which desired closed-loop trajectories are known and a controller that allows the system to closely follow those trajectories is computed. Then, the solution of the LQR problem for continuous-time systems is presented. Finally, we consider the case in which the precise position of the desired poles of the closed-loop system is known, and introduce a data-based variant of a robust pole-placement procedure. The applicability of the proposed methods is tested using numerical simulations.
SY · May 14
On Data-based Nash Equilibria in LQ Nonzero-sum Differential Games
Victor G. Lopez, Matthias A. Müller
This paper considers data-based solutions of linear-quadratic nonzero-sum differential games. Two cases are considered. First, the deterministic game is solved and Nash equilibrium strategies are obtained by using persistently excited data from the multiagent system. Then, a stochastic formulation of the game is considered, where each agent measures a different noisy output signal and state observers must be designed for each player. It is shown that the proposed data-based solutions of these games are equivalent to known model-based procedures. The resulting data-based solutions are validated in a numerical experiment.
SY · May 12
Estimating Hormone Concentrations in the Pituitary-Thyroid Feedback Loop from Irregularly Sampled Measurements
Seth Siriya, Tobias M. Wolff, Isabelle Krauss et al.
Model-based control techniques have recently been investigated for the recommendation of medication dosages to address thyroid diseases. These techniques often rely on knowledge of internal hormone concentrations that cannot be measured from blood samples. Moreover, the measurable concentrations are typically only obtainable at irregular sampling times. In this work, we empirically verify a notion of sample-based detectability that accounts for irregular sampling of the measurable concentrations on two pituitary-thyroid loop models representing patients with hypo- and hyperthyroidism, respectively, and include the internal concentrations as states. We then implement sample-based moving horizon estimation for the models, and test its performance on virtual patients across a range of sampling schemes. Our study shows robust stability of the estimator across all scenarios, and that more frequent sampling leads to less estimation error in the presence of model uncertainty and misreported dosages.
SY · May 22
Beyond Shrinkage: Foundations of Data-Driven Control for Piecewise Affine Systems
Gianluca Giacomelli, Victor G. Lopez, Simone Formentin et al.
Data-enabled predictive control (DeePC) has recently attracted attention as a promising approach for controlling systems directly from raw data, without requiring an explicit identification step. However, DeePC has not yet been extended to piecewise affine (PWA) systems, despite their extensive use in the (predictive) control literature and their universal approximation capabilities. To address this gap, in this work, we lay the foundations for data-enabled predictive control of PWA systems, providing: $(i)$ their behavioral characterization; $(ii)$ an extension of Willems' Fundamental Lemma to represent their behavior from raw data; $(iii)$ an analysis of the coherence of DeePC strategies using a linear predictor and shrinkage regularizers; and $(iv)$ a study of the impact of misclassification errors on structuring data for prediction. Our theoretical findings are validated by numerical results on a simple example, emphasizing the need to extend beyond a regularized version of the foundational DeePC framework to design control actions that are both effective and coherent with a PWA system's behavior, thus ensuring the controller's explainability.
SY · Apr 24
Robust stability of event-triggered nonlinear moving horizon estimation
Isabelle Krauss, Victor G. Lopez, Matthias A. Müller
In this work, we propose an event-triggered moving horizon estimation (ET-MHE) scheme for the remote state estimation of general nonlinear systems. In the presented method, whenever an event is triggered, a single measurement is transmitted and the nonlinear MHE optimization problem is subsequently solved. If no event is triggered, the current state estimate is updated using an open-loop prediction based on the system dynamics. Moreover, we introduce a novel event-triggering rule under which we demonstrate robust global exponential stability of the ET-MHE scheme, assuming a suitable detectability condition is met. In addition, we show that with the adoption of a varying horizon length, a tighter bound on the estimation error can be achieved. Finally, we validate the effectiveness of the proposed method through two illustrative examples.
SY · Mar 23
Sample-based Moving Horizon Estimation
Isabelle Krauss, Victor G. Lopez, Matthias A. Müller
In this paper, we propose a sample-based moving horizon estimation (MHE) scheme for general nonlinear systems to estimate the current system state using irregularly and/or infrequently available measurements. The cost function of the MHE optimization problem is suitably designed to accommodate these irregular output sequences. We also establish that, under a suitable sample-based detectability condition known as sample-based incremental input/output-to-state stability (i-IOSS), the proposed sample-based MHE achieves robust global exponential stability (RGES). Additionally, for the case of linear systems, we draw connections between sample-based observability and sample-based i-IOSS. This demonstrates that previously established conditions for linear systems to be sample-based observable can be utilized to verify or design sampling strategies that satisfy the conditions to guarantee RGES of the sample-based MHE. Finally, the effectiveness of the proposed sample-based MHE is illustrated through a simulation example.
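For the linear case mentioned in this abstract, sample-based observability can be illustrated with a short sketch. It assumes a noise-free discrete-time system x_{t+1} = A x_t, y_t = C x_t whose output is measured only at the time instants in a given set, and checks whether the stacked rows C A^k over those instants have full column rank. The function name and the example system are hypothetical, and the sketch does not reproduce the i-IOSS analysis of the paper.

```python
import numpy as np

def sample_based_observability_matrix(A, C, sample_times):
    """Stack C A^k for each sampling instant k (non-negative integers)."""
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in sample_times])

def is_sample_based_observable(A, C, sample_times):
    """True if the initial state is uniquely determined by the sampled outputs."""
    O = sample_based_observability_matrix(A, C, sample_times)
    return np.linalg.matrix_rank(O) == A.shape[0]

# Assumed example: a 90-degree rotation with only the first state measured.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
C = np.array([[1.0, 0.0]])

every_other_step = is_sample_based_observable(A, C, [0, 2, 4])  # samples see only x1
irregular = is_sample_based_observable(A, C, [0, 3])            # samples see x1 and x2
print(every_other_step, irregular)
```

Here, sampling at instants 0, 2, 4 only ever measures the first state component up to sign, so the state cannot be reconstructed, whereas the irregular pair 0, 3 yields a rank-2 matrix. This shows how the choice of sampling instants, and not only the pair (A, C), determines whether estimation is possible.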
SY · Mar 23
Sample-based detectability and moving horizon state estimation of continuous-time systems
Isabelle Krauss, Victor G. Lopez, Matthias A. Müller
In this paper, we propose a detectability condition for nonlinear continuous-time systems with irregular/infrequent output measurements, namely a sample-based version of incremental integral input/output-to-state stability (i-iIOSS). We provide a sufficient condition for an i-iIOSS system to be sample-based i-iIOSS. This condition is also exploited to analyze the relationship between sample-based i-iIOSS and sample-based observability for linear systems, such that previously established sampling strategies for linear systems can be used to guarantee sample-based i-iIOSS. Furthermore, we present a sample-based moving horizon estimation scheme, for which robust stability can be shown. Finally, we illustrate the applicability of the proposed estimation scheme through a biomedical simulation example.
LG · Feb 11
Tuning the burn-in phase in training recurrent neural networks improves their performance
Julian D. Schiller, Malte Heinrich, Victor G. Lopez et al.
Training recurrent neural networks (RNNs) with standard backpropagation through time (BPTT) can be challenging, especially in the presence of long input sequences. A practical alternative to reduce computational and memory overhead is to perform BPTT repeatedly over shorter segments of the training data set, corresponding to truncated BPTT. In this paper, we examine the training of RNNs when using such a truncated learning approach for time series tasks. Specifically, we establish theoretical bounds on the accuracy and performance loss when optimizing over subsequences instead of the full data sequence. This reveals that the burn-in phase of the RNN is an important tuning knob in its training, with significant impact on the performance guarantees. We validate our theoretical results through experiments on standard benchmarks from the fields of system identification and time series forecasting. In all experiments, we observe a strong influence of the burn-in phase on the training process, and proper tuning can lead to a reduction of the prediction error on the training and test data of more than 60% in some cases.
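The role of the burn-in phase described in this abstract can be illustrated with a toy sketch. It assumes a scalar linear recurrence h_{t+1} = a h_t + b u_t as a stand-in for an RNN, a model with the correct parameters, and a training segment whose hidden state is initialized at zero because the true state at the segment start is unknown. The first `n_burn` steps only warm up the hidden state and are excluded from the loss. All names and numbers are hypothetical, and the sketch is not the paper's setup or its bounds.

```python
import numpy as np

def simulate(a, b, u, h0):
    """Run h_{t+1} = a*h_t + b*u_t and return the hidden-state trajectory."""
    h, out = h0, []
    for u_t in u:
        h = a * h + b * u_t
        out.append(h)
    return np.array(out)

def segment_loss(a, b, u_seg, y_seg, n_burn):
    """Mean squared error on a segment, ignoring the first n_burn (burn-in) steps."""
    y_hat = simulate(a, b, u_seg, h0=0.0)  # hidden state reset to zero at segment start
    err = y_hat[n_burn:] - y_seg[n_burn:]
    return float(np.mean(err ** 2))

rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y = simulate(0.9, 1.0, u, h0=5.0)          # "true" data generated from a nonzero state

start, length = 100, 60                    # one truncated segment of the full sequence
u_seg, y_seg = u[start:start + length], y[start:start + length]

loss_no_burn_in = segment_loss(0.9, 1.0, u_seg, y_seg, n_burn=0)
loss_with_burn_in = segment_loss(0.9, 1.0, u_seg, y_seg, n_burn=20)
print(loss_no_burn_in, loss_with_burn_in)
```

Even though the model parameters are exact, the segment loss without burn-in is inflated by the wrong initial hidden state. With burn-in, the effect of that initialization has largely decayed before the loss is evaluated, which is the sense in which the burn-in length acts as a tuning knob.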
SY · Mar 31
An Output Feedback Q-learning Algorithm for Optimal Control of Nonlinear Systems with Koopman Linear Embedding
Victor G. Lopez, Malte Heinrich, Matthias A. Müller
In the reinforcement learning literature, strong theoretical guarantees have been obtained for algorithms applicable to LTI systems. However, in the nonlinear case, only weaker results have been obtained, for algorithms that mostly rely on function approximation strategies such as neural networks. In this paper, we study the applicability of a known output-feedback Q-learning algorithm to the class of nonlinear systems that admit a Koopman linear embedding. This algorithm uses only input-output data, and no knowledge of either the system model or the Koopman lifting functions is required. Moreover, no function approximation techniques are used, and the same theoretical guarantees as for LTI systems are preserved. Furthermore, we analyze the performance of the algorithm when the Koopman linear embedding is only an approximation of the real nonlinear system. A simulation example verifies the applicability of this method.