Cristian R. Rojas

h-index: 24

Papers: 23

Citations: 190

Novelty: 40%

AI Score: 26

Ranked #162,850 of 194,257 authors (top 84%); #978 in SY (top 60%)

23 Papers

1.2 · SY · Mar 26, 2018

Parametric Identification Using Weighted Null-Space Fitting

Miguel Galrinho, Cristian R. Rojas, Hakan Hjalmarsson

In identification of dynamical systems, the prediction error method using a quadratic cost function provides asymptotically efficient estimates under Gaussian noise and additional mild assumptions, but in general it requires solving a non-convex optimization problem. An alternative class of methods uses a non-parametric model as an intermediate step to obtain the model of interest. Weighted null-space fitting (WNSF) belongs to this class. It is a weighted least-squares method consisting of three steps. In the first step, a high-order ARX model is estimated. In a second least-squares step, this high-order estimate is reduced to a parametric estimate. In the third step, weighted least squares is used to reduce the variance of the estimates. The method is flexible in parametrization and suitable for both open- and closed-loop data. In this paper, we show that WNSF provides estimates with the same asymptotic properties as PEM with a quadratic cost function when the model orders are chosen according to the true system. Also, simulation studies indicate that WNSF may be competitive with state-of-the-art methods.
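
As a rough illustration of the first step only (the high-order ARX estimate), the following is a minimal sketch of an ordinary least-squares ARX fit. It is not the authors' implementation: the function name `fit_high_order_arx`, the simulated first-order system, and the chosen order are all assumptions made for this example, and the later reduction and weighting steps of WNSF are not shown.

```python
import numpy as np

def fit_high_order_arx(u, y, order):
    """Least-squares fit of the ARX model
    y[t] + a1*y[t-1] + ... + an*y[t-n] = b1*u[t-1] + ... + bn*u[t-n] + e[t].
    Returns the estimated (a, b) coefficient vectors."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    rows = []
    for t in range(order, len(y)):
        past_y = -y[t - order:t][::-1]  # -y[t-1], ..., -y[t-n]
        past_u = u[t - order:t][::-1]   #  u[t-1], ...,  u[t-n]
        rows.append(np.concatenate([past_y, past_u]))
    Phi = np.array(rows)  # regressor matrix, one row per time step
    theta, *_ = np.linalg.lstsq(Phi, y[order:], rcond=None)
    return theta[:order], theta[order:]

# Hypothetical test system: y[t] = 0.5*y[t-1] + u[t-1] + e[t],
# i.e. a1 = -0.5 and b1 = 1 in the convention above.
rng = np.random.default_rng(0)
N = 2000
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.5 * y[t - 1] + u[t - 1] + e[t]

# Deliberately over-parametrized fit (order 5 for a first-order system).
a_hat, b_hat = fit_high_order_arx(u, y, order=5)
```

In this toy setting the leading coefficients should land close to -0.5 and 1, with the remaining coefficients close to zero; a real application would choose the ARX order as a function of the sample size.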

1.2 · SY · Mar 21, 2013

Application Set Approximation in Optimal Input Design for Model Predictive Control

Afrooz Ebadat, Mariette Annergren, Christian A. Larsson et al.

This contribution considers one central aspect of experiment design in system identification. When a control design is based on an estimated model, the achievable performance is related to the quality of the estimate. The degradation in control performance due to errors in the estimated model is measured by an application cost function. In order to use an optimization-based input design method, a convex approximation of the set of models that satisfies the control specification is required. The standard approach is to use a quadratic approximation of the application cost function, where the main computational effort is to find the corresponding Hessian matrix. Our main contribution is an alternative approach for this problem, which uses the structure of the underlying optimal control problem to considerably reduce the computations needed to find the application set. This technique allows the use of applications-oriented input design for MPC on much more complex plants. The approach is numerically evaluated on a distillation control problem.

14.9 · LG · Jun 12, 2023

DRCFS: Doubly Robust Causal Feature Selection

Francesco Quinzan, Ashkan Soleymani, Patrick Jaillet et al.

Knowing the features of a complex system that are highly relevant to a particular target variable is of fundamental interest in many areas of science. Existing approaches are often limited to linear settings, sometimes lack guarantees, and in most cases, do not scale to the problem at hand, in particular to images. We propose DRCFS, a doubly robust feature selection method for identifying the causal features even in nonlinear and high dimensional settings. We provide theoretical guarantees, illustrate necessary conditions for our assumptions, and perform extensive experiments across a wide range of simulated and semi-synthetic datasets. DRCFS significantly outperforms existing state-of-the-art methods, selecting robust features even in challenging highly non-linear and high-dimensional problems.

1.2 · SY · Apr 3, 2017

Computing monotone policies for Markov decision processes: a nearly-isotonic penalty approach

Robert Mattila, Cristian R. Rojas, Vikram Krishnamurthy et al.

This paper discusses algorithms for solving Markov decision processes (MDPs) that have monotone optimal policies. We propose a two-stage alternating convex optimization scheme that can accelerate the search for an optimal policy by exploiting the monotone property. The first stage is a linear program formulated in terms of the joint state-action probabilities. The second stage is a regularized problem formulated in terms of the conditional probabilities of actions given states. The regularization uses techniques from nearly-isotonic regression. While a variety of iterative methods can be used in the first formulation of the problem, we show in numerical simulations that, in particular, the alternating direction method of multipliers (ADMM) can be significantly accelerated using the regularization step.
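
For readers unfamiliar with nearly-isotonic regression, its penalty charges only the decreases in a sequence, so a non-decreasing sequence costs nothing. The sketch below shows that generic penalty; how the paper applies it to the conditional action probabilities is not specified in the abstract, and the function name `nearly_isotonic_penalty` and the weight `lam` are assumptions for this example.

```python
import numpy as np

def nearly_isotonic_penalty(x, lam=1.0):
    """Generic nearly-isotonic penalty: lam * sum_i max(x[i] - x[i+1], 0).
    Zero for a non-decreasing sequence, positive when some step decreases."""
    x = np.asarray(x, dtype=float)
    drops = np.maximum(x[:-1] - x[1:], 0.0)  # size of each monotonicity violation
    return lam * drops.sum()

# A non-decreasing sequence incurs no penalty ...
monotone = nearly_isotonic_penalty([0.1, 0.4, 0.4, 0.9])
# ... while a single drop (0.6 -> 0.4) is charged in proportion to its size.
violating = nearly_isotonic_penalty([0.1, 0.6, 0.4, 0.9])
```

Because the penalty is convex and piecewise linear, adding it to a convex objective keeps the problem convex, which is consistent with the convex scheme described in the abstract.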

1.2 · SY · Jan 13, 2015

Variance Analysis of Linear SIMO Models with Spatially Correlated Noise

Niklas Everitt, Giulio Bottegal, Cristian R. Rojas et al.

Substantial improvement in accuracy of identified linear time-invariant single-input multi-output (SIMO) dynamical models is possible when the disturbances affecting the output measurements are spatially correlated. Using an orthogonal representation for the modules composing the SIMO structure, in this paper we show that the variance of a parameter estimate of a module is dependent on the model structure of the other modules, and the correlation structure of the disturbances. In addition, we quantify the variance-error for the parameter estimates for finite model orders, where the effect of noise correlation structure, model structure and signal spectra are visible. From these results, we derive the noise correlation structure under which the mentioned model parameterization gives the lowest variance, when one module is identified using fewer parameters than the other modules.

1.2 · SY · Mar 22, 2018

An asymptotically optimal indirect approach to continuous-time system identification

Rodrigo A. González, Cristian R. Rojas, James S. Welsh

The indirect approach to continuous-time system identification consists in estimating continuous-time models by first determining an appropriate discrete-time model. For a zero-order hold sampling mechanism, this approach usually leads to a transfer function estimate with relative degree 1, independent of the relative degree of the strictly proper real system. In this paper, a refinement of these methods is developed. Inspired by indirect PEM, we propose a method that enforces a fixed relative degree in the continuous-time transfer function estimate, and show that the resulting estimator is consistent and asymptotically efficient. Extensive numerical simulations are put forward to show the performance of this estimator when contrasted with other indirect and direct methods for continuous-time system identification.

2.0 · LG · Apr 4, 2023

Optimal Transport for Correctional Learning

Rebecka Winqvist, Inês Lourenco, Francesco Quinzan et al.

The contribution of this paper is a generalized formulation of correctional learning using optimal transport, which is about how to optimally transport one mass distribution to another. Correctional learning is a framework developed to enhance the accuracy of parameter estimation processes by means of a teacher-student approach. In this framework, an expert agent, referred to as the teacher, modifies the data used by a learning agent, known as the student, to improve its estimation process. The objective of the teacher is to alter the data such that the student's estimation error is minimized, subject to a fixed intervention budget. Compared to existing formulations of correctional learning, our novel optimal transport approach provides several benefits. It allows for the estimation of more complex characteristics as well as the consideration of multiple intervention policies for the teacher. We evaluate our approach on two theoretical examples, and on a human-robot interaction application in which the teacher's role is to improve the robot's performance in an inverse reinforcement learning setting.

1.2 · SY · Nov 20, 2023

Unraveling the Control Engineer's Craft with Neural Networks

Braghadeesh Lakshminarayanan, Federico Dettù, Cristian R. Rojas et al.

Many industrial processes require suitable controllers to meet their performance requirements. More often, a sophisticated digital twin is available, which is a highly complex model that is a virtual representation of a given physical process, whose parameters may not be properly tuned to capture the variations in the physical process. In this paper, we present a sim2real, direct data-driven controller tuning approach, where the digital twin is used to generate input-output data and suitable controllers for several perturbations in its parameters. State-of-the-art neural-network architectures are then used to learn the controller tuning rule that maps input-output data onto the controller parameters, based on artificially generated data from perturbed versions of the digital twin. In this way, as far as we are aware, we tackle for the first time the problem of re-calibrating the controller by meta-learning the tuning rule directly from data, thus practically replacing the control engineer with a machine learning model. The benefits of this methodology are illustrated via numerical simulations for several choices of neural-network architectures.

2.3 · MA · Apr 15, 2024

Kernel-based learning with guarantees for multi-agent applications

Krzysztof Kowalczyk, Paweł Wachel, Cristian R. Rojas

This paper addresses a kernel-based learning problem for a network of agents locally observing a latent multidimensional, nonlinear phenomenon in a noisy environment. We propose a learning algorithm that requires only mild a priori knowledge about the phenomenon under investigation and delivers a model with corresponding non-asymptotic high probability error bounds. Both non-asymptotic analysis of the method and numerical simulation results are presented and discussed in the paper.

5.9 · ML · May 5, 2023 · Code

Decentralized diffusion-based learning under non-parametric limited prior knowledge

Paweł Wachel, Krzysztof Kowalczyk, Cristian R. Rojas

We study the problem of diffusion-based network learning of a nonlinear phenomenon, $m$, from local agents' measurements collected in a noisy environment. For a decentralized network and information spreading merely between directly neighboring nodes, we propose a non-parametric learning algorithm that avoids raw data exchange and requires only mild a priori knowledge about $m$. Non-asymptotic estimation error bounds are derived for the proposed method. Its potential applications are illustrated through simulation experiments.

3.1 · LG · Nov 15, 2021

A Teacher-Student Markov Decision Process-based Framework for Online Correctional Learning

Inês Lourenço, Rebecka Winqvist, Cristian R. Rojas et al.

A classical learning setting typically concerns an agent/student who collects data, or observations, from a system in order to estimate a certain property of interest. Correctional learning is a type of cooperative teacher-student framework where a teacher, who has partial knowledge about the system, has the ability to observe and alter (correct) the observations received by the student in order to improve the accuracy of its estimate. In this paper, we show how the variance of the estimate of the student can be reduced with the help of the teacher. We formulate the corresponding online problem - where the teacher has to decide, at each time instant, whether or not to change the observations due to a limited budget - as a Markov decision process, from which the optimal policy is derived using dynamic programming. We validate the framework in numerical experiments, and compare the optimal online policy with the one from the batch setting.

1.6 · LG · May 28, 2021

Asymptotically Optimal Bandits under Weighted Information

Matias I. Müller, Cristian R. Rojas

We study the problem of regret minimization in a multi-armed bandit setup where the agent is allowed to play multiple arms at each round by spreading the resources usually allocated to only one arm. At each iteration the agent selects a normalized power profile and receives a Gaussian vector as outcome, where the unknown variance of each sample is inversely proportional to the power allocated to that arm. The reward corresponds to a linear combination of the power profile and the outcomes, resembling a linear bandit. By spreading the power, the agent can choose to collect information much faster than in a traditional multi-armed bandit at the price of reducing the accuracy of the samples. This setup is fundamentally different from that of a linear bandit: the regret is known to scale as $\Theta(\sqrt{T})$ for linear bandits, while in this setup the agent receives a much more detailed feedback, for which we derive a tight $\log(T)$ problem-dependent lower bound. We propose a Thompson-Sampling-based strategy, called Weighted Thompson Sampling (WTS), that designs the power profile as its posterior belief of each arm being the best arm, and show that its upper bound matches the derived logarithmic lower bound. Finally, we apply this strategy to a problem of control and system identification, where the goal is to estimate the maximum gain (also called $\mathcal{H}_\infty$-norm) of a linear dynamical system based on batches of input-output samples.

5.7 · ML · Dec 17, 2019

A Finite-Sample Deviation Bound for Stable Autoregressive Processes

Rodrigo A. González, Cristian R. Rojas

In this paper, we study non-asymptotic deviation bounds of the least squares estimator in Gaussian AR($n$) processes. By relying on martingale concentration inequalities and a tail-bound for $\chi^2$ distributed variables, we provide a concentration bound for the sample covariance matrix of the process output. With this, we present a problem-dependent finite-time bound on the deviation probability of any fixed linear combination of the estimated parameters of the AR($n$) process. We discuss extensions and limitations of our approach.

5.7 · ML · Dec 3, 2019

Bayesian Model Selection for Change Point Detection and Clustering

Othmane Mazhar, Cristian R. Rojas, Carlo Fischione et al.

We address the new problem of estimating a piece-wise constant signal with the purpose of detecting its change points and the levels of clusters. Our approach is to model it as a nonparametric penalized least-squares model selection on a family of models indexed over the collection of partitions of the design points and propose a computationally efficient algorithm to approximately solve it. Statistically, minimizing such a penalized criterion yields an approximation to the maximum a posteriori probability (MAP) estimator. The criterion is then analyzed and an oracle inequality is derived using a Gaussian concentration inequality. The oracle inequality is used to derive on one hand conditions for consistency and on the other hand an adaptive upper bound on the expected square risk of the estimator, which statistically motivates our approximation. Finally, we apply our algorithm to simulated data to experimentally validate the statistical guarantees and illustrate its behavior.

1.2 · SY · Sep 6, 2018

Estimating Models with High-Order Noise Dynamics Using Semi-Parametric Weighted Null-Space Fitting

Miguel Galrinho, Cristian R. Rojas, Hakan Hjalmarsson

Standard system identification methods often provide inconsistent estimates with closed-loop data. With the prediction error method (PEM), this issue is solved by using a noise model that is flexible enough to capture the noise spectrum. However, a too flexible noise model (i.e., too many parameters) increases the model complexity, which can cause additional numerical problems for PEM. In this paper, we consider the weighted null-space fitting (WNSF) method. With this method, the system is first modeled using a non-parametric ARX model, which is then reduced to a parametric model of interest using weighted least squares. In the reduction step, a parametric noise model does not need to be estimated if it is not of interest. Because the flexibility of the noise model is increased with the sample size, this will still provide consistent estimates in closed loop and asymptotically efficient estimates in open loop. In this paper, we prove these results, and we derive the asymptotic covariance for the estimation error obtained in closed loop, which is optimal for an infinite-order noise model. For this purpose, we also derive a new technical result for geometric variance analysis, instrumental to our end. Finally, we perform a simulation study to illustrate the benefits of the method when the noise model cannot be parametrized by a low-order model.

1.2 · IT · Jul 26, 2015

Estimator Selection: End-Performance Metric Aspects

Dimitrios Katselis, Cristian R. Rojas, Carolyn L. Beck

Recently, a framework for application-oriented optimal experiment design has been introduced. In this context, the distance of the estimated system from the true one is measured in terms of a particular end-performance metric. This treatment leads to estimates of the unknown system that are superior to those obtained with classical experiment designs based on usual pointwise functional distances of the estimated system from the true one. The separation of the system estimator from the experiment design is done within this new framework by choosing and fixing the estimation method to either a maximum likelihood (ML) approach or a Bayesian estimator such as the minimum mean square error (MMSE). Since the MMSE estimator delivers a system estimate with lower mean square error (MSE) than the ML estimator for finite-length experiments, it is usually considered the best choice in practice in signal processing and control applications. Within the application-oriented framework a related meaningful question is: Are there end-performance metrics for which the ML estimator outperforms the MMSE when the experiment is finite-length? In this paper, we affirmatively answer this question based on a simple linear Gaussian regression example.

2.8 · ML · Jul 22, 2015

Evaluation of Spectral Learning for the Identification of Hidden Markov Models

Robert Mattila, Cristian R. Rojas, Bo Wahlberg

Hidden Markov models have successfully been applied as models of discrete time series in many fields. Often, when applied in practice, the parameters of these models have to be estimated. The currently predominating identification methods, such as maximum-likelihood estimation and especially expectation-maximization, are iterative and prone to have problems with local minima. A non-iterative method employing a spectral subspace-like approach has recently been proposed in the machine learning literature. This paper evaluates the performance of this algorithm, and compares it to the performance of the expectation-maximization algorithm, on a number of numerical examples. We find that the performance is mixed; it successfully identifies some systems with relatively few available observations, but fails completely for some systems even when a large number of observations is available. An open question is how this discrepancy can be explained. We provide some indications that it could be related to how well-conditioned some system parameters are.

1.5 · ML · Jan 23, 2015

Bayesian Learning for Low-Rank matrix reconstruction

Martin Sundin, Cristian R. Rojas, Magnus Jansson et al.

We develop latent variable models for Bayesian learning based low-rank matrix completion and reconstruction from linear measurements. For under-determined systems, the developed methods are shown to reconstruct low-rank matrices when neither the rank nor the noise power is known a priori. We derive relations between the latent variable models and several low-rank promoting penalty functions. The relations justify the use of Kronecker structured covariance matrices in a Gaussian based prior. In the methods, we use evidence approximation and expectation-maximization to learn the model parameters. The performance of the methods is evaluated through extensive numerical simulations.

1.2 · SY · Apr 20, 2015

Approximate Regularization Paths for Nuclear Norm Minimization Using Singular Value Bounds -- With Implementation and Extended Appendix

Niclas Blomberg, Cristian R. Rojas, Bo Wahlberg

The widely used nuclear norm heuristic for rank minimization problems introduces a regularization parameter which is difficult to tune. We have recently proposed a method to approximate the regularization path, i.e., the optimal solution as a function of the parameter, which requires solving the problem only for a sparse set of points. In this paper, we extend the algorithm to provide error bounds for the singular values of the approximation. We exemplify the algorithms on large scale benchmark examples in model order reduction. Here, the order of a dynamical system is reduced by means of constrained minimization of the nuclear norm of a Hankel matrix.
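
To make the objects in this abstract concrete, the sketch below builds a Hankel matrix from an impulse response and evaluates its nuclear norm (the sum of its singular values). It only illustrates the quantity being minimized, not the regularization-path algorithm; the helper names `hankel_matrix` and `nuclear_norm` and the example impulse response are assumptions for this example.

```python
import numpy as np

def hankel_matrix(h, rows):
    """Hankel matrix with entries H[i, j] = h[i + j], built from a sequence h."""
    h = np.asarray(h, dtype=float)
    cols = len(h) - rows + 1
    return np.array([h[i:i + cols] for i in range(rows)])

def nuclear_norm(M):
    """Nuclear norm: the sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

# Hypothetical impulse response of a first-order system, h[k] = 0.8**k.
# Its Hankel matrix has rank 1, matching the system order.
h = 0.8 ** np.arange(10)
H = hankel_matrix(h, rows=5)
singular_values = np.linalg.svd(H, compute_uv=False)
```

The rank of the Hankel matrix equals the order of the underlying system in this example, which is why the nuclear norm, a convex surrogate for rank, is a natural penalty for model order.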

2.3 · ST · Dec 1, 2014

How to monitor and mitigate stair-casing in l1 trend filtering

Cristian R. Rojas, Bo Wahlberg

In this paper we study the estimation of changing trends in time-series using $\ell_1$ trend filtering. This method generalizes 1D Total Variation (TV) denoising for detection of step changes in means to detecting changes in trends, and it relies on a convex optimization problem for which there are very efficient numerical algorithms. It is known that TV denoising suffers from the so-called stair-case effect, which leads to detecting false change points. The objective of this paper is to show that $\ell_1$ trend filtering also suffers from a certain stair-case problem. The analysis is based on an interpretation of the dual variables of the optimization problem in the method as an integrated random walk. We discuss consistency conditions for $\ell_1$ trend filtering, how to monitor their fulfillment, and how to modify the algorithm to avoid the stair-case false detection problem.
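
For reference, the two convex problems contrasted in this abstract are commonly written as follows. This is the standard textbook form rather than notation taken from the paper: the symbols $y_t$ (observations), $x_t$ (estimated signal), $N$ (number of samples) and $\lambda$ (regularization weight) are assumptions.

```latex
% 1D total variation (TV) denoising: penalizes first differences,
% which favors piecewise-constant estimates.
\hat{x}^{\mathrm{TV}} = \arg\min_{x \in \mathbb{R}^N}
  \frac{1}{2} \sum_{t=1}^{N} (y_t - x_t)^2
  + \lambda \sum_{t=2}^{N} |x_t - x_{t-1}|

% \ell_1 trend filtering: penalizes second differences,
% which favors piecewise-linear estimates.
\hat{x}^{\mathrm{TF}} = \arg\min_{x \in \mathbb{R}^N}
  \frac{1}{2} \sum_{t=1}^{N} (y_t - x_t)^2
  + \lambda \sum_{t=2}^{N-1} |x_{t-1} - 2 x_t + x_{t+1}|
```

In both problems the change points are the indices where the penalized difference is nonzero, so a false nonzero difference corresponds to a falsely detected change.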

1.2 · SY · Jul 22, 2014

Approximate Regularization Path for Nuclear Norm Based H2 Model Reduction

Niclas Blomberg, Cristian R. Rojas, Bo Wahlberg

This paper concerns model reduction of dynamical systems using the nuclear norm of the Hankel matrix to make a trade-off between model fit and model complexity. This results in a convex optimization problem where this trade-off is determined by one crucial design parameter. The main contribution is a methodology to approximately calculate all solutions up to a certain tolerance to the model reduction problem as a function of the design parameter. This is called the regularization path in sparse estimation and is a very important tool in order to find the appropriate balance between fit and complexity. We extend this to the more complicated nuclear norm case. The key idea is to determine when to exactly calculate the optimal solution using an upper bound based on the so-called duality gap. Hence, by solving a fixed number of optimization problems the whole regularization path up to a given tolerance can be efficiently computed. We illustrate this approach on some numerical examples.

3.3 · NA · Jun 30, 2014

Relevance Singular Vector Machine for low-rank matrix sensing

Martin Sundin, Saikat Chatterjee, Magnus Jansson et al.

In this paper we develop a new Bayesian inference method for low rank matrix reconstruction. We call the new method the Relevance Singular Vector Machine (RSVM) where appropriate priors are defined on the singular vectors of the underlying matrix to promote low rank. To accelerate computations, a numerically efficient approximation is developed. The proposed algorithms are applied to matrix completion and matrix reconstruction problems and their performance is studied numerically.

9.2 · ST · Jan 21, 2014

On change point detection using the fused lasso method

Cristian R. Rojas, Bo Wahlberg

In this paper we analyze the asymptotic properties of $\ell_1$-penalized maximum likelihood estimation of signals with piece-wise constant mean values and/or variances. The focus is on segmentation of a non-stationary time series with respect to changes in these model parameters. This change point detection and estimation problem is also referred to as total variation denoising or $\ell_1$-mean filtering and has many important applications in most fields of science and engineering. We establish the (approximate) sparse consistency properties, including rate of convergence, of the so-called fused lasso signal approximator (FLSA). We show that this only holds if the signs of the corresponding consecutive changes are all different, and that this estimator is otherwise incapable of correctly detecting the underlying sparsity pattern. The key idea is to notice that the optimality conditions for this problem can be analyzed using techniques related to Brownian bridge theory.