Lennart Ljung

SY
18 papers
878 citations
Novelty 28%
AI Score 38

18 Papers

SY Feb 2, 2019
Nonlinear System Identification: A User-Oriented Roadmap

Johan Schoukens, Lennart Ljung

The goal of this article is twofold. Firstly, nonlinear system identification is introduced to a wide audience, guiding practicing engineers and newcomers in the field to a sound solution of their data-driven modeling problems for nonlinear dynamic systems. In addition, the article also provides a broad perspective on the topic to researchers who are already familiar with linear system identification theory, showing the similarities and differences between the linear and nonlinear problems. The reader will be referred to the existing literature for detailed mathematical explanations and formal proofs. Here the focus is on the basic philosophy, giving an intuitive understanding of the problems and the solutions by making a guided tour along the wide range of user choices in nonlinear system identification. Guidelines will be given, in addition to many examples, to reach that goal.

LG Jan 30, 2023
Deep networks for system identification: a Survey

Gianluigi Pillonetto, Aleksandr Aravkin, Daniel Gedon et al.

Deep learning is a topic of considerable current interest. The availability of massive data collections and powerful software resources has led to an impressive amount of results in many application areas that reveal essential but hidden properties of the observations. System identification learns mathematical descriptions of dynamic systems from input-output data and can thus benefit from the advances of deep neural networks to enrich the possible range of models to choose from. For this reason, we provide a survey of deep learning from a system identification perspective. We cover a wide spectrum of topics to enable researchers to understand the methods, providing rigorous practical and theoretical insights into the benefits and challenges of using them. The main aim of the identified model is to predict new data from previous observations. This can be achieved with different deep learning based modelling techniques, and we discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks. Their parameters have to be estimated from past data, trying to optimize the prediction performance. For this purpose, we discuss a specific set of first-order optimization tools that has emerged as efficient. The survey then draws connections to the well-studied area of kernel-based methods. They control the data fit by regularization terms that penalize models not in line with prior assumptions. We illustrate how to cast them in deep architectures to obtain deep kernel-based methods. The success of deep learning also resulted in surprising empirical observations, like the counter-intuitive behaviour of models with many parameters. We discuss the role of overparameterized models, including their connection to kernels, as well as implicit regularization mechanisms which affect generalization, specifically the interesting phenomenon of benign overfitting ...

SY Aug 24, 2020
Dynamic Network Reconstruction from Heterogeneous Datasets

Zuogong Yue, Johan Thunberg, Wei Pan et al.

Performing multiple experiments is common when learning internal mechanisms of complex systems. These experiments can include perturbations to parameters or external disturbances. A challenging problem is to efficiently incorporate all collected data simultaneously to infer the underlying dynamic network. This paper addresses the reconstruction of dynamic networks from heterogeneous datasets under the assumption that underlying networks share the same Boolean structure across all experiments. Parametric models for dynamical structure functions are derived to describe causal interactions between measured variables. Multiple datasets are integrated into one regression problem with additional demands of group sparsity to assure network sparsity and structure consistency. To acquire structured group sparsity, we propose a sampling-based method, together with extended versions of l1 methods and sparse Bayesian learning. The performance of the proposed methods is benchmarked in numerical simulation. In summary, this paper presents efficient methods for network reconstruction from multiple experiments, and reveals practical experience that could guide applications.

SY Oct 29, 2018
Systems Aliasing in Dynamic Network Reconstruction: Issues on Low Sampling Frequencies

Zuogong Yue, Johan Thunberg, Lennart Ljung et al.

Network reconstruction of dynamical continuous-time (CT) systems is motivated by applications in many fields. Due to experimental limitations, especially in biology, data could be sampled at low frequencies, leading to significant challenges in network inference. We introduce the concept of "system aliasing" and characterize the minimal sampling frequency that allows reconstruction of CT systems from data sampled at low frequencies. A test criterion is also proposed to check whether system aliasing is present. With no system aliasing, the paper provides an algorithm to reconstruct dynamic networks from data in the presence of noise. In addition, when there is system aliasing, we perform studies that add prior information about the system, such as sparsity. This paper opens new directions in modelling of network systems where samples have significant costs. Such tools are essential to process the available data in applications subject to current experimental limitations.
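
As a rough illustration of why low sampling rates make continuous-time reconstruction ambiguous, the sketch below maps two different continuous-time poles to the same discrete-time pole. It is a scalar toy case, not the paper's matrix-valued criterion; the function name `discrete_pole` and all numerical values are assumptions chosen for illustration.

```python
import cmath
import math


def discrete_pole(s: complex, h: float) -> complex:
    """Map a continuous-time pole s to the pole exp(s*h) seen at sampling period h."""
    return cmath.exp(s * h)


h = 0.5                              # sampling period (assumed value)
s_slow = complex(-0.2, 1.0)          # a continuous-time pole (assumed value)
s_fast = s_slow + 2j * math.pi / h   # same damping, frequency shifted by 2*pi/h

z_slow = discrete_pole(s_slow, h)
z_fast = discrete_pole(s_fast, h)

# Both continuous-time poles produce the same sampled behaviour, so the
# samples alone cannot tell the two systems apart.
gap = abs(z_slow - z_fast)
print(gap)
```

Halving `h` in this toy case separates the two discrete-time poles again, which is the intuition behind a minimal sampling frequency.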

SY Nov 21, 2018
A state-space approach to sparse dynamic network reconstruction

Zuogong Yue, Johan Thunberg, Lennart Ljung et al.

Dynamic network reconstruction has been shown to be challenging due to the requirements on sparse network structures and network identifiability. The direct parametric method (e.g., using ARX models) requires a large number of parameters in model selection. Amongst the parametric models, only a restricted class can easily be used to address network sparsity without rendering the optimization problem intractable. To overcome these problems, this paper presents a state-space-based method, which significantly reduces the number of unknown parameters in model selection. Furthermore, we avoid various difficulties arising in gradient computation by using the Expectation Maximization (EM) algorithm instead. To enhance network sparsity, the prior distribution is constructed by using the Sparse Bayesian Learning (SBL) approach in the M-step. To solve the SBL problem, another EM algorithm is embedded, where we impose conditions on network identifiability in each iteration. In sum, this paper provides a solution to reconstruct dynamic networks that avoids the difficulties inherent to gradient computation and simplifies the model selection.

SY Nov 14, 2016
Gray Box Identification of State-Space Models Using Difference of Convex Programming

Chengpu Yu, Lennart Ljung, Michel Verhaegen

Gray-box identification is prevalent in modeling physical and networked systems. However, due to the non-convex nature of the gray-box identification problem, good initial parameter estimates are crucial for a successful application. In this paper, a new identification method is proposed by exploiting the low-rank and structured Hankel matrix of the impulse response. This identification problem is recast into a difference-of-convex programming problem, which is then solved by the sequential convex programming approach with the associated initialization obtained by nuclear-norm optimization. The presented method aims to achieve the maximum impulse-response fitting while not requiring additional (non-convex) conditions to secure non-singularity of the similarity transformation relating the given state-space matrices to the gray-box parameterized ones. This overcomes a persistent shortcoming in a number of recent contributions on this topic, and the new method can be applied for the structured state-space realization even if the involved system parameters are unidentifiable. The method can be used both for directly estimating the gray-box parameters and for providing initial parameter estimates for further iterative search in a conventional gray-box identification setup.

SY Apr 17, 2018
Identification of Sparse Continuous-Time Linear Systems with Low Sampling Rate: Optimization Approaches

Zuogong Yue, Johan Thunberg, Lennart Ljung et al.

This paper addresses identification of sparse linear and noise-driven continuous-time state-space systems, i.e., the right-hand sides in the dynamical equations depend only on a subset of the states. The key assumption in this study is that the sample rate is not high enough to directly infer the continuous-time system from the data. This assumption is relevant in applications where sampling is expensive or requires human intervention (e.g., biomedical applications). We propose an iterative optimization scheme with $l_1$-regularization, where the search directions are restricted to those that decrease the prediction error in each iteration. We provide numerical examples illustrating the proposed method; the method outperforms least squares estimation for large noise.

LG Sep 11, 2024
Deep Learning of Dynamic Systems using System Identification Toolbox(TM)

Tianyu Dai, Khaled Aljanaideh, Rong Chen et al.

MATLAB(R) releases over the last 3 years have witnessed a continuing growth in the dynamic modeling capabilities offered by the System Identification Toolbox(TM). The emphasis has been on integrating deep learning architectures and training techniques that facilitate the use of deep neural networks as building blocks of nonlinear models. The toolbox offers neural state-space models which can be extended with auto-encoding features that are particularly suited for reduced-order modeling of large systems. The toolbox contains several other enhancements that deepen its integration with state-of-the-art machine learning techniques, leverage auto-differentiation features for state estimation, and enable direct use of raw numeric matrices and timetables for training models.

SY May 14
Randomized Atomic Feature Models for Physics-Informed Identification of Dynamic Systems

Rajiv Singh, Mario Sznaier, Lennart Ljung

We present a physics-informed framework for system identification based on randomized stable atomic features. Impulse responses are represented as random superpositions of stable atoms, namely damped complex exponentials associated with poles sampled inside a prescribed disk. Identification is then cast as a convex regularized least-squares problem with optional linear, second-order-cone, and KYP constraints. The approach generalizes random Fourier and random Laplace features to the damped, nonstationary regime relevant to engineering systems while retaining modal interpretability and scalable finite-dimensional computation. The main analytic point is an operator-theoretic Disk-Bochner viewpoint: positive measures over stable poles generate positive-definite kernels with a radius-dependent shift defect, while a converse scalar disk moment representation for an arbitrary kernel is characterized by subnormality of the canonical shift. We prove this statement, establish an RKHS-to-l1 embedding, show that sampled poles induce a valid finite atomic gauge, discuss random-feature convergence, and state sparse-recovery guarantees conditionally on the restricted-eigenvalue properties of the realized disk-Vandermonde or input-output design matrix. We also connect the normalized transfer function problem to Nevanlinna-Pick interpolation and LFT set-membership. The framework directly encodes stability margins, modal localization, DC-gain bounds, monotonicity, passivity, relative degree, settling-time targets, and time/frequency-domain error bounds. Numerical comparisons illustrate how physically meaningful priors can compensate for poor excitation and improve constrained impulse-response recovery in an under-informative data setting.

LG Mar 8, 2021
A Crash Course on Reinforcement Learning

Farnaz Adib Yaghmaie, Lennart Ljung

The emerging field of Reinforcement Learning (RL) has led to impressive results in varied domains like strategy games, robotics, etc. This handout aims to give a simple introduction to RL from a control perspective and discusses three possible approaches to solving an RL problem: Policy Gradient, Policy Iteration, and Model-building. Dynamical systems might have a discrete action space, like cartpole, where the two possible actions are +1 and -1, or a continuous action space, like linear Gaussian systems. Our discussion covers both cases.

SY Mar 31, 2020
Deep State Space Models for Nonlinear System Identification

Daniel Gedon, Niklas Wahlström, Thomas B. Schön et al.

Deep state space models (SSMs) are an actively researched model class for temporal models developed in the deep learning community which have a close connection to classic SSMs. The use of deep SSMs as a black-box identification model can describe a wide range of dynamics due to the flexibility of deep neural networks. Additionally, the probabilistic nature of the model class allows the uncertainty of the system to be modelled. In this work a deep SSM class and its parameter learning algorithm are explained in an effort to extend the toolbox of nonlinear identification methods with a deep learning based method. Six recent deep SSMs are evaluated in a first unified implementation on nonlinear system identification benchmarks.

SY Jul 3, 2017
On Asymptotic Properties of Hyperparameter Estimators for Kernel-based Regularization Methods

Biqiang Mu, Tianshi Chen, Lennart Ljung

The kernel-based regularization method has two core issues: kernel design and hyperparameter estimation. In this paper, we focus on the second issue and study the properties of several hyperparameter estimators, including the empirical Bayes (EB) estimator, two Stein's unbiased risk estimators (SURE), and their corresponding Oracle counterparts, with an emphasis on the asymptotic properties of these hyperparameter estimators. To this end, we first derive and then rewrite the first-order optimality conditions of these hyperparameter estimators, leading to several insights on these hyperparameter estimators. Then we show that as the number of data goes to infinity, the two SUREs converge to the best hyperparameter minimizing the corresponding mean square error, respectively, while the more widely used EB estimator converges to another best hyperparameter minimizing the expectation of the EB estimation criterion. This indicates that the two SUREs are asymptotically optimal but the EB estimator is not. Surprisingly, the convergence rate of the two SUREs is slower than that of the EB estimator, and moreover, unlike the two SUREs, the EB estimator is independent of the convergence rate of $Φ^TΦ/N$ to its limit, where $Φ$ is the regression matrix and $N$ is the number of data. A Monte Carlo simulation is provided to demonstrate the theoretical results.
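
For orientation, the setting behind this abstract can be written out in its commonly used form. This is a sketch under assumptions, not the paper's exact notation: the output vector $Y$, the parameter vector $\theta$, the kernel matrix $K(\eta)$ with hyperparameter $\eta$, and the noise variance $\sigma^2$ do not appear in the abstract and are introduced here only for illustration, alongside the regression matrix $\Phi$ and the number of data $N$.

$$
Y = \Phi\theta + V, \qquad
\hat\theta(\eta) = \arg\min_{\theta}\ \|Y - \Phi\theta\|^2 + \sigma^2\,\theta^T K(\eta)^{-1}\theta
= \left(\Phi^T\Phi + \sigma^2 K(\eta)^{-1}\right)^{-1}\Phi^T Y
$$

$$
\hat\eta_{\mathrm{EB}} = \arg\min_{\eta}\ Y^T Q(\eta)^{-1} Y + \log\det Q(\eta), \qquad
Q(\eta) = \Phi K(\eta)\Phi^T + \sigma^2 I_N
$$

In this form the EB estimator maximizes the marginal likelihood of $Y$ under a zero-mean Gaussian prior on $\theta$ with covariance $K(\eta)$, whereas the SURE estimators instead minimize unbiased estimates of a mean square error.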

SY Jul 2, 2015
Regularized linear system identification using atomic, nuclear and kernel-based norms: the role of the stability constraint

Gianluigi Pillonetto, Tianshi Chen, Alessandro Chiuso et al.

Inspired by ideas taken from the machine learning literature, new regularization techniques have been recently introduced in linear system identification. In particular, all the adopted estimators solve a regularized least squares problem, differing in the nature of the penalty term assigned to the impulse response. Popular choices include atomic and nuclear norms (applied to Hankel matrices) as well as norms induced by the so-called stable spline kernels. In this paper, a comparative study of estimators based on these different types of regularizers is reported. Our findings reveal that stable spline kernels outperform approaches based on atomic and nuclear norms since they suitably embed information on impulse response stability and smoothness. This point is illustrated using the Bayesian interpretation of regularization. We also design a new class of regularizers defined by "integral" versions of stable spline/TC kernels. Under quite realistic experimental conditions, the new estimators outperform classical prediction error methods, even when the latter are equipped with an oracle for model order selection.

SY Sep 18, 2015
Identifying Biochemical Reaction Networks From Heterogeneous Datasets

Wei Pan, Ye Yuan, Lennart Ljung et al.

In this paper, we propose a new method to identify biochemical reaction networks (i.e. both reactions and kinetic parameters) from heterogeneous datasets. Such datasets can contain (a) data from several replicates of an experiment performed on a biological system; (b) data measured from a biochemical network subjected to different experimental conditions, for example, changes/perturbations in biological inductions, temperature, gene knock-out, gene over-expression, etc. Simultaneous integration of various datasets to perform system identification has the potential to avoid non-identifiability issues typically arising when only single datasets are used.

SY Apr 13, 2015
Maximum entropy properties of discrete-time first-order stable spline kernel

Tianshi Chen, Tohid Ardeshiri, Francesca P. Carli et al.

The first-order stable spline (SS-1) kernel is used extensively in regularized system identification. In particular, the stable spline estimator models the impulse response as a zero-mean Gaussian process whose covariance is given by the SS-1 kernel. In this paper, we discuss the maximum entropy properties of this prior. In particular, we formulate the exact maximum entropy problem solved by the SS-1 kernel without Gaussian and uniform sampling assumptions. Under general sampling schemes, we also explicitly derive the special structure underlying the SS-1 kernel (e.g. characterizing the tridiagonal nature of its inverse), also giving it a maximum entropy covariance completion interpretation. Along the way, similar maximum entropy properties of the Wiener kernel are also given.
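
The tridiagonal structure mentioned above can be checked numerically on a small example. The sketch below assumes the uniformly sampled discrete-time form $K_{ij} = \lambda^{\max(i,j)}$ for the kernel, with an arbitrary decay rate and size; the helper names `tc_kernel` and `invert` are hypothetical, and the inverse is computed by plain Gauss-Jordan elimination rather than by any closed-form result from the paper.

```python
def tc_kernel(n, lam):
    """Kernel matrix with entries lam**max(i, j) for i, j = 1..n (assumed form)."""
    return [[lam ** max(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] is reduced to [I | A^-1].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


n, lam = 6, 0.8                      # size and decay rate (assumed values)
K = tc_kernel(n, lam)
K_inv = invert(K)

# Largest entry of the inverse more than one step away from the diagonal.
off_band = max(abs(K_inv[i][j]) for i in range(n) for j in range(n) if abs(i - j) > 1)
print(off_band)
```

Entries of the inverse outside the three central diagonals vanish up to rounding error, while the first off-diagonal stays clearly nonzero.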

SY Apr 11, 2015
Regularized system identification using orthonormal basis functions

Tianshi Chen, Lennart Ljung

Most existing results on regularized system identification focus on regularized impulse response estimation. Since the impulse response model is a special case of orthonormal basis functions, it is interesting to consider whether it is possible to tackle regularized system identification using more compact orthonormal basis functions. In this paper, we explore two possibilities. First, we construct a reproducing kernel Hilbert space of impulse responses from orthonormal basis functions and then use the induced reproducing kernel for regularized impulse response estimation. Second, we extend the regularization method from impulse response estimation to the more general orthonormal basis function estimation. For both cases, the poles of the basis functions are treated as hyperparameters and estimated by the empirical Bayes method. Then we further show that the former is a special case of the latter, and more specifically, the former is equivalent to ridge regression of the coefficients of the orthonormal basis functions.

OC Nov 20, 2014
Maximum Entropy Kernels for System Identification

Francesca Paola Carli, Tianshi Chen, Lennart Ljung

A new nonparametric approach for system identification has been recently proposed where the impulse response is modeled as the realization of a zero-mean Gaussian process whose covariance (kernel) has to be estimated from data. In this scheme, quality of the estimates crucially depends on the parametrization of the covariance of the Gaussian process. A family of kernels that have been shown to be particularly effective in the system identification framework is the family of Diagonal/Correlated (DC) kernels. Maximum entropy properties of a related family of kernels, the Tuned/Correlated (TC) kernels, have been recently pointed out in the literature. In this paper we show that maximum entropy properties indeed extend to the whole family of DC kernels. The maximum entropy interpretation can be exploited in conjunction with results on matrix completion problems in the graphical models literature to shed light on the structure of the DC kernel. In particular, we prove that the DC kernel admits a closed-form factorization, inverse and determinant. These results can be exploited both to improve the numerical stability and to reduce the computational complexity associated with the computation of the DC estimator.

LG Sep 20, 2013
Scalable Anomaly Detection in Large Homogenous Populations

Henrik Ohlsson, Tianshi Chen, Sina Khoshfetrat Pakazad et al.

Anomaly detection in large populations is a challenging but highly relevant problem. The problem is essentially a multi-hypothesis problem, with a hypothesis for every division of the systems into normal and anomalous systems. The number of hypotheses grows rapidly with the number of systems, and approximate solutions become a necessity for any problem of practical interest. In the current paper we take an optimization approach to this multi-hypothesis problem. We first observe that the problem is equivalent to a non-convex combinatorial optimization problem. We then relax the problem to a convex problem that can be solved distributively on the systems and that stays computationally tractable as the number of systems increases. An interesting property of the proposed method is that it can, under certain conditions, be shown to give exactly the same result as the combinatorial multi-hypothesis problem, and the relaxation is hence tight.