Bo Wahlberg

SY
h-index2
29papers
572citations
Novelty47%
AI Score48

29 Papers

SYMar 19, 2012
An ADMM Algorithm for Solving l_1 Regularized MPC

Mariette Annergren, Anders Hansson, Bo Wahlberg

We present an Alternating Direction Method of Multipliers (ADMM) algorithm for solving optimization problems with an l_1 regularized least-squares cost function subject to recursive equality constraints. The considered optimization problem has applications in control, for example in l_1 regularized MPC. The ADMM algorithm is easy to implement, converges fast to a solution of moderate accuracy, and enables separation of the optimization problem into sub-problems that may be solved in parallel. We show that the most costly step of the proposed ADMM algorithm is equivalent to solving an LQ regulator problem with an extra linear term in the cost function, a problem that can be solved efficiently using a Riccati recursion. We apply the ADMM algorithm to an example of l_1 regularized MPC. The numerical examples confirm fast convergence to moderate accuracy and a linear complexity in the MPC prediction horizon.

SYMar 21, 2013
Application Set Approximation in Optimal Input Design for Model Predictive Control

Afrooz Ebadat, Mariette Annergren, Christian A. Larsson et al.

This contribution considers one central aspect of experiment design in system identification. When a control design is based on an estimated model, the achievable performance is related to the quality of the estimate. The degradation in control performance due to errors in the estimated model is measured by an application cost function. In order to use an optimization based input design method, a convex approximation of the set of models that atisfies the control specification is required. The standard approach is to use a quadratic approximation of the application cost function, where the main computational effort is to find the corresponding Hessian matrix. Our main contribution is an alternative approach for this problem, which uses the structure of the underlying optimal control problem to considerably reduce the computations needed to find the application set. This technique allows the use of applications oriented input design for MPC on much more complex plants. The approach is numerically evaluated on a distillation control problem.

SYFeb 1, 2017
Asymptotically Efficient Identification of Known-Sensor Hidden Markov Models

Robert Mattila, Cristian R. Rojas, Vikram Krishnamurthy et al.

We consider estimating the transition probability matrix of a finite-state finite-observation alphabet hidden Markov model with known observation probabilities. The main contribution is a two-step algorithm; a method of moments estimator (formulated as a convex optimization problem) followed by a single iteration of a Newton-Raphson maximum likelihood estimator. The two-fold contribution of this letter is, firstly, to theoretically show that the proposed estimator is consistent and asymptotically efficient, and secondly, to numerically show that the method is computationally less demanding than conventional methods - in particular for large data sets.

SYApr 3, 2017
Computing monotone policies for Markov decision processes: a nearly-isotonic penalty approach

Robert Mattila, Cristian R. Rojas, Vikram Krishnamurthy et al.

This paper discusses algorithms for solving Markov decision processes (MDPs) that have monotone optimal policies. We propose a two-stage alternating convex optimization scheme that can accelerate the search for an optimal policy by exploiting the monotone property. The first stage is a linear program formulated in terms of the joint state-action probabilities. The second stage is a regularized problem formulated in terms of the conditional probabilities of actions given states. The regularization uses techniques from nearly-isotonic regression. While a variety of iterative method can be used in the first formulation of the problem, we show in numerical simulations that, in particular, the alternating method of multipliers (ADMM) can be significantly accelerated using the regularization step.

SYNov 7, 2016
A Markov Decision Process Model to Guide Treatment of Abdominal Aortic Aneurysms

Robert Mattila, Antti Siika, Joy Roy et al.

An abdominal aortic aneurysm (AAA) is an enlargement of the abdominal aorta which, if left untreated, can progressively widen and may rupture with fatal consequences. In this paper, we determine an optimal treatment policy using Markov decision process modeling. The policy is optimal with respect to the number of quality adjusted life-years (QALYs) that are expected to be accumulated during the remaining life of a patient. The new policy takes into account factors that are ignored by the current clinical policy (e.g. the life-expectancy and the age-dependent surgical mortality). The resulting optimal policy is structurally different from the current policy. In particular, the policy suggests that young patients with small aneurysms should undergo surgery. The robustness of the policy structure is demonstrated using simulations. A gain in the number of expected QALYs is shown, which indicates a possibility of improved care for patients with AAAs.

SYMar 13, 2013
On Optimal Input Design for Feed-forward Control

Per Hägg, Bo Wahlberg

This paper considers optimal input design when the intended use of the identified model is to construct a feed-forward controller based on measurable disturbances. The objective is to find a minimum power excitation signal to be used in system identification experiment, such that the corresponding model-based feed-forward controller guarantees, with a given probability, that the variance of the output signal is within given specifications. To start with, some low order model problems are analytically solved and fundamental properties of the optimal input signal solution are presented. The optimal input signal contains feed-forward control and depends of the noise model and transfer function of the system in a specific way. Next, we show how to apply the partial correlation approach to closed loop optimal experiment design to the general feed-forward problem. A framework for optimal input signal design for feed-forward control is presented and numerically evaluated on a temperature control problem.

LGApr 4, 2023
Optimal Transport for Correctional Learning

Rebecka Winqvist, Inês Lourenco, Francesco Quinzan et al.

The contribution of this paper is a generalized formulation of correctional learning using optimal transport, which is about how to optimally transport one mass distribution to another. Correctional learning is a framework developed to enhance the accuracy of parameter estimation processes by means of a teacher-student approach. In this framework, an expert agent, referred to as the teacher, modifies the data used by a learning agent, known as the student, to improve its estimation process. The objective of the teacher is to alter the data such that the student's estimation error is minimized, subject to a fixed intervention budget. Compared to existing formulations of correctional learning, our novel optimal transport approach provides several benefits. It allows for the estimation of more complex characteristics as well as the consideration of multiple intervention policies for the teacher. We evaluate our approach on two theoretical examples, and on a human-robot interaction application in which the teacher's role is to improve the robots performance in an inverse reinforcement learning setting.

9.3SYMar 16
Path planning with moving obstacles using stochastic optimal control

Seyyed Reza Jafari, Anders Hansson, Bo Wahlberg

Navigating a collision-free and optimal trajectory for a robot is a challenging task, particularly in environments with moving obstacles such as humans. We formulate this problem as a stochastic optimal control problem. Since solving the full problem is computationally demanding, we introduce a tractable approximation whose Bellman equation can be solved efficiently. The resulting value function is then incorporated as a terminal penalty in an online rollout framework. We construct a trade-off curve between safety and performance to identify an appropriate weighting between them, and compare the performance with other methods. Simulation results show that the proposed rollout approach can be tuned to reach the target in nearly the same expected time as receding horizon $A^\star$ while maintaining a larger expected minimum distance to the moving obstacle. The results also show that the proposed method outperforms the considered CBF-based methods when a larger obstacle clearance is desired, while achieving comparable performance otherwise.

1.5OCMay 21
Performance Bounds for Rollout Policies in Stochastic Shortest Path Problems

Anders Hansson, Bo Wahlberg

This paper concerns rollout and certainty-equivalent rollout policies for stochastic shortest path problems with absorbing terminal states. The main result provides a direct non-asymptotic performance certificate for a fixed rollout policy: the loss relative to the optimal value is controlled by the uniform accuracy of the value approximation and by the expected time for which the rollout closed loop remains away from the terminal state. Thus, in the undiscounted transient setting, the expected hitting time plays the role of a discount or finite-horizon parameter in more standard approximate dynamic programming bounds. This paper also gives a performance-difference identity showing that suboptimality is exactly accumulated through the transient occupation measure, and a deterministic sharpness example showing that the hitting-time factor is unavoidable. Finally, consequences under uniform hitting-time and Foster-Lyapunov drift conditions are given, and extend the argument to certainty-equivalent rollout by adding a separate local model-mismatch term.

7.4SYApr 8
Stochastic Adaptive Control for Systems with Nonlinear Parameterization: Almost Sure Stability and Tracking

Lantian Zhang, Bo Wahlberg, Silun Zhang

This paper concerns the adaptive control problem for a class of nonlinear stochastic systems in which the state update is given by a nonlinear function of linear dynamics plus additive stochastic noise. Such systems arise in a wide range of applications, including recurrent neural networks, social dynamics, and signal processing. Despite their importance, adaptive control for these systems remains relatively unexplored in the literature. This gap is primarily due to the inherently nonconvex dependence of the system dynamics on unknown parameters, which significantly complicates both controller design and analysis. To address these challenges, we propose an online nonlinear weighted least-squares (WLS)-based parameter estimation algorithm and establish the global strong consistency of the resulting parameter estimates. In contrast to most existing results, our consistency analysis does not rely on restrictive assumptions such as persistent excitation conditions of the trajectory data, making it applicable to stochastic adaptive control settings. Building on the proposed estimator, we further develop an adaptive control algorithm with an attenuating excitation signal that can effectively combine adaptive estimation and feedback control. Finally, we are able to show that the resulting closed-loop system is globally stable and that the system trajectory can track, in a long-run average sense, the reference trajectory generated with the true system parameters. The proposed methods and theoretical results are finally validated through simulations in two nonlinear interaction network applications.

MLMar 14, 2025
Bayes and Biased Estimators Without Hyper-parameter Estimation: Comparable Performance to the Empirical-Bayes-Based Regularized Estimator

Yue Ju, Bo Wahlberg, Håkan Hjalmarsson

Regularized system identification has become a significant complement to more classical system identification. It has been numerically shown that kernel-based regularized estimators often perform better than the maximum likelihood estimator in terms of minimizing mean squared error (MSE). However, regularized estimators often require hyper-parameter estimation. This paper focuses on ridge regression and the regularized estimator by employing the empirical Bayes hyper-parameter estimator. We utilize the excess MSE to quantify the MSE difference between the empirical-Bayes-based regularized estimator and the maximum likelihood estimator for large sample sizes. We then exploit the excess MSE expressions to develop both a family of generalized Bayes estimators and a family of closed-form biased estimators. They have the same excess MSE as the empirical-Bayes-based regularized estimator but eliminate the need for hyper-parameter estimation. Moreover, we conduct numerical simulations to show that the performance of these new estimators is comparable to the empirical-Bayes-based regularized estimator, while computationally, they are more efficient.

LGNov 15, 2021
A Teacher-Student Markov Decision Process-based Framework for Online Correctional Learning

Inês Lourenço, Rebecka Winqvist, Cristian R. Rojas et al.

A classical learning setting typically concerns an agent/student who collects data, or observations, from a system in order to estimate a certain property of interest. Correctional learning is a type of cooperative teacher-student framework where a teacher, who has partial knowledge about the system, has the ability to observe and alter (correct) the observations received by the student in order to improve the accuracy of its estimate. In this paper, we show how the variance of the estimate of the student can be reduced with the help of the teacher. We formulate the corresponding online problem - where the teacher has to decide, at each time instant, whether or not to change the observations due to a limited budget - as a Markov decision process, from which the optimal policy is derived using dynamic programming. We validate the framework in numerical experiments, and compare the optimal online policy with the one from the batch setting.

ROOct 14, 2020
A Geometric Approach to On-road Motion Planning for Long and Multi-Body Heavy-Duty Vehicles

Rui Oliveira, Oskar Ljungqvist, Pedro F. Lima et al.

Driving heavy-duty vehicles, such as buses and tractor-trailer vehicles, is a difficult task in comparison to passenger cars. Most research on motion planning for autonomous vehicles has focused on passenger vehicles, and many unique challenges associated with heavy-duty vehicles remain open. However, recent works have started to tackle the particular difficulties related to on-road motion planning for buses and tractor-trailer vehicles using numerical optimization approaches. In this work, we propose a framework to design an optimization objective to be used in motion planners. Based on geometric derivations, the method finds the optimal trade-off between the conflicting objectives of centering different axles of the vehicle in the lane. For the buses, we consider the front and rear axles trade-off, whereas for articulated vehicles, we consider the tractor and trailer rear axles trade-off. Our results show that the proposed design strategy results in planned paths that considerably improve the behavior of heavy-duty vehicles by keeping the whole vehicle body in the center of the lane.

LGOct 3, 2020
Learning the Step-size Policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno Algorithm

Lucas N. Egidio, Anders Hansson, Bo Wahlberg

We consider the problem of how to learn a step-size policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. This is a limited computational memory quasi-Newton method widely used for deterministic unconstrained optimization but currently avoided in large-scale problems for requiring step sizes to be provided at each iteration. Existing methodologies for the step size selection for L-BFGS use heuristic tuning of design parameters and massive re-evaluations of the objective function and gradient to find appropriate step-lengths. We propose a neural network architecture with local information of the current iterate as the input. The step-length policy is learned from data of similar optimization problems, avoids additional evaluations of the objective function, and guarantees that the output step remains inside a pre-defined interval. The corresponding training procedure is formulated as a stochastic optimization problem using the backpropagation through time algorithm. The performance of the proposed method is evaluated on the training of classifiers for the MNIST database for handwritten digits and for CIFAR-10. The results show that the proposed algorithm outperforms heuristically tuned optimizers such as ADAM, RMSprop, L-BFGS with a backtracking line search, and L-BFGS with a constant step size. The numerical results also show that a learned policy can be used as a warm-start to train new policies for different problems after a few additional training steps, highlighting its potential use in multiple large-scale optimization problems.

LGJun 12, 2020
Task-similarity Aware Meta-learning through Nonparametric Kernel Regression

Arun Venkitaraman, Anders Hansson, Bo Wahlberg

This paper investigates the use of nonparametric kernel-regression to obtain a tasksimilarity aware meta-learning algorithm. Our hypothesis is that the use of tasksimilarity helps meta-learning when the available tasks are limited and may contain outlier/ dissimilar tasks. While existing meta-learning approaches implicitly assume the tasks as being similar, it is generally unclear how this task-similarity could be quantified and used in the learning. As a result, most popular metalearning approaches do not actively use the similarity/dissimilarity between the tasks, but rely on availability of huge number of tasks for their working. Our contribution is a novel framework for meta-learning that explicitly uses task-similarity in the form of kernels and an associated meta-learning algorithm. We model the task-specific parameters to belong to a reproducing kernel Hilbert space where the kernel function captures the similarity across tasks. The proposed algorithm iteratively learns a meta-parameter which is used to assign a task-specific descriptor for every task. The task descriptors are then used to quantify the task-similarity through the kernel function. We show how our approach conceptually generalizes the popular meta-learning approaches of model-agnostic meta-learning (MAML) and Meta-stochastic gradient descent (Meta-SGD) approaches. Numerical experiments with regression tasks show that our algorithm outperforms these approaches when the number of tasks is limited, even in the presence of outlier or dissimilar tasks. This supports our hypothesis that task-similarity helps improve the metalearning performance in task-limited and adverse settings.

MLMay 8, 2020
On Training and Evaluation of Neural Network Approaches for Model Predictive Control

Rebecka Winqvist, Arun Venkitaraman, Bo Wahlberg

The contribution of this paper is a framework for training and evaluation of Model Predictive Control (MPC) implemented using constrained neural networks. Recent studies have proposed to use neural networks with differentiable convex optimization layers to implement model predictive controllers. The motivation is to replace real-time optimization in safety critical feedback control systems with learnt mappings in the form of neural networks with optimization layers. Such mappings take as the input the state vector and predict the control law as the output. The learning takes place using training data generated from off-line MPC simulations. However, a general framework for characterization of learning approaches in terms of both model validation and efficient training data generation is lacking in literature. In this paper, we take the first steps towards developing such a coherent framework. We discuss how the learning problem has similarities with system identification, in particular input design, model structure selection and model validation. We consider the study of neural network architectures in PyTorch with the explicit MPC constraints implemented as a differentiable optimization layer using CVXPY. We propose an efficient approach of generating MPC input samples subject to the MPC model constraints using a hit-and-run sampler. The corresponding true outputs are generated by solving the MPC offline using OSOP. We propose different metrics to validate the resulting approaches. Our study further aims to explore the advantages of incorporating domain knowledge into the network structure from a training and evaluation perspective. Different model structures are numerically tested using the proposed framework in order to obtain more insights in the properties of constrained neural networks based MPC.

ROJan 19, 2020
Optimization-Based On-Road Path Planning for Articulated Vehicles

Rui Oliveira, Oskar Ljungqvist, Pedro F. Lima et al.

Maneuvering an articulated vehicle on narrow road stretches is often a challenging task for a human driver. Unless the vehicle is accurately steered, parts of the vehicle's bodies may exceed its assigned drive lane, resulting in an increased risk of collision with surrounding traffic. In this work, an optimization-based path-planning algorithm is proposed targeting on-road driving scenarios for articulated vehicles composed of a tractor and a trailer. To this end, we model the tractor-trailer vehicle in a road-aligned coordinate frame suited for on-road planning. Based on driving heuristics, a set of different optimization objectives is proposed, with the overall goal of designing a path planner that computes paths which minimize the off-track of the vehicle bodies swept area, while remaining on the road and avoiding collision with obstacles. The proposed optimization-based path-planning algorithm, together with the different optimization objectives, is evaluated and analyzed in simulations on a set of complicated and practically relevant on-road planning scenarios using the most challenging tractor-trailer dimensions.

SYDec 20, 2019
Teaching robots to perceive time -- A reinforcement learning approach (Extended version)

Inês Lourenço, Bo Wahlberg, Rodrigo Ventura

Time perception is the phenomenological experience of time by an individual. In this paper, we study how to replicate neural mechanisms involved in time perception, allowing robots to take a step towards temporal cognition. Our framework follows a twofold biologically inspired approach. The first step consists of estimating the passage of time from sensor measurements, since environmental stimuli influence the perception of time. Sensor data is modeled as Gaussian processes that represent the second-order statistics of the natural environment. The estimated elapsed time between two events is computed from the maximum likelihood estimate of the joint distribution of the data collected between them. Moreover, exactly how time is encoded in the brain remains unknown, but there is strong evidence of the involvement of dopaminergic neurons in timing mechanisms. Since their phasic activity has a similar behavior to the reward prediction error of temporal-difference learning models, the latter are used to replicate this behavior. The second step of this approach consists therefore of applying the agent's estimate of the elapsed time in a reinforcement learning problem, where a feature representation called Microstimuli is used. We validate our framework by applying it to an experiment that was originally conducted with mice, and conclude that a robot using this framework is able to reproduce the timing mechanisms of the animal's brain.

MLNov 26, 2019
Learning sparse linear dynamic networks in a hyper-parameter free setting

Arun Venkitaraman, Håkan Hjalmarsson, Bo Wahlberg

We address the issue of estimating the topology and dynamics of sparse linear dynamic networks in a hyperparameter-free setting. We propose a method to estimate the network dynamics in a computationally efficient and parameter tuning-free iterative framework known as SPICE (Sparse Iterative Covariance Estimation). The estimated dynamics directly reveal the underlying topology. Our approach does not assume that the network is undirected and is applicable even with varying noise levels across the modules of the network. We also do not assume any explicit prior knowledge on the network dynamics. Numerical experiments with realistic dynamic networks illustrate the usefulness of our method.

LGNov 26, 2019
Recursive Prediction of Graph Signals with Incoming Nodes

Arun Venkitaraman, Saikat Chatterjee, Bo Wahlberg

Kernel and linear regression have been recently explored in the prediction of graph signals as the output, given arbitrary input signals that are agnostic to the graph. In many real-world problems, the graph expands over time as new nodes get introduced. Keeping this premise in mind, we propose a method to recursively obtain the optimal prediction or regression coefficients for the recently propose Linear Regression over Graphs (LRG), as the graph expands with incoming nodes. This comes as a natural consequence of the structure C(W)= of the regression problem, and obviates the need to solve a new regression problem each time a new node is added. Experiments with real-world graph signals show that our approach results in good prediction performance which tends to be close to that obtained from knowing the entire graph apriori.

ROMay 5, 2019
Path Planning for Autonomous Bus Driving in Urban Environments

Rui Oliveira, Pedro F. Lima, Gonçalo Collares Pereira et al.

Driving in urban environments often presents difficult situations that require expert maneuvering of a vehicle. These situations become even more challenging when considering large vehicles, such as buses. We present a path planning framework that addresses the demanding driving task of buses in urban areas. The approach is formulated as an optimization problem using the road-aligned vehicle model. The road-aligned frame introduces a distortion on the vehicle body and obstacles, motivating the development of novel approximations that capture this distortion. These approximations allow for the formulation of safe and non-conservative collision avoidance constraints. Unlike other path planning approaches, our method exploits curbs and other sweepable regions, which a bus must often sweep over in order to manage certain maneuvers. Furthermore, it takes full advantage of the particular characteristics of buses, namely the overhangs, an elevated part of the vehicle chassis, that can sweep over curbs. Simulations are presented, showing the applicability and benefits of the proposed method.

SYJan 26, 2018
Trajectory Generation using Sharpness Continuous Dubins-like Paths with Applications in Control of Heavy Duty Vehicles

Rui Oliveira, Pedro F. Lima, Marcello Cirillo et al.

We present a trajectory generation framework for control of wheeled vehicles under steering actuator constraints. The motivation is smooth autonomous driving of heavy vehicles. The key idea is to take into account rate, and additionally, torque limitations of the steering actuator directly. Previous methods only take into account curvature rate limitations, which deal indirectly with steering rate limitations. We propose the new concept of Sharpness Continuous curves, which uses cubic and sigmoid curvature trajectories together with circular arcs to steer the vehicle. The obtained trajectories are characterized by a smooth and continuously differentiable steering angle profile. These trajectories provide low-level controllers with reference signals which are easier to track, resulting in improved performance. The smoothness of the obtained steering profiles also results in increased passenger comfort. The method is characterized by a fast computation time, which can be further speeded up through the use of simple pre-computations. We detail possible path planning applications of the method, and conduct simulations that show its advantages and real time capabilities.

SYJul 21, 2017
Trajectory Planning Under Vehicle Dimension Constraints Using Sequential Linear Programming

Mogens Graf Plessen, Pedro F. Lima, Jonas Martensson et al.

This paper presents a spatial-based trajectory planning method for automated vehicles under actuator, obstacle avoidance, and vehicle dimension constraints. Starting from a nonlinear kinematic bicycle model, vehicle dynamics are transformed to a road-aligned coordinate frame with path along the road centerline replacing time as the dependent variable. Space-varying vehicle dimension constraints are linearized around a reference path to pose convex optimization problems. Such constraints do not require to inflate obstacles by safety-margins and therefore maximize performance in very constrained environments. A sequential linear programming (SLP) algorithm is motivated. A linear program (LP) is solved at each SLP-iteration. The relation between LP formulation and maximum admissible traveling speeds within vehicle tire friction limits is discussed. The proposed method is evaluated in a roomy and in a tight maneuvering driving scenario, whereby a comparison to a semi-analytical clothoid-based path planner is given. Effectiveness is demonstrated particularly for very constrained environments, requiring to account for constraints and planning over the entire obstacle constellation space.

MLJul 22, 2015
Evaluation of Spectral Learning for the Identification of Hidden Markov Models

Robert Mattila, Cristian R. Rojas, Bo Wahlberg

Hidden Markov models have successfully been applied as models of discrete time series in many fields. Often, when applied in practice, the parameters of these models have to be estimated. The currently predominating identification methods, such as maximum-likelihood estimation and especially expectation-maximization, are iterative and prone to have problems with local minima. A non-iterative method employing a spectral subspace-like approach has recently been proposed in the machine learning literature. This paper evaluates the performance of this algorithm, and compares it to the performance of the expectation-maximization algorithm, on a number of numerical examples. We find that the performance is mixed; it successfully identifies some systems with relatively few available observations, but fails completely for some systems even when a large amount of observations is available. An open question is how this discrepancy can be explained. We provide some indications that it could be related to how well-conditioned some system parameters are.

SYApr 20, 2015
Approximate Regularization Paths for Nuclear Norm Minimization Using Singular Value Bounds -- With Implementation and Extended Appendix

Niclas Blomberg, Cristian R. Rojas, Bo Wahlberg

The widely used nuclear norm heuristic for rank minimization problems introduces a regularization parameter which is difficult to tune. We have recently proposed a method to approximate the regularization path, i.e., the optimal solution as a function of the parameter, which requires solving the problem only for a sparse set of points. In this paper, we extend the algorithm to provide error bounds for the singular values of the approximation. We exemplify the algorithms on large scale benchmark examples in model order reduction. Here, the order of a dynamical system is reduced by means of constrained minimization of the nuclear norm of a Hankel matrix.

STDec 1, 2014
How to monitor and mitigate stair-casing in l1 trend filtering

Cristian R. Rojas, Bo Wahlberg

In this paper we study the estimation of changing trends in time-series using $\ell_1$ trend filtering. This method generalizes 1D Total Variation (TV) denoising for detection of step changes in means to detecting changes in trends, and it relies on a convex optimization problem for which there are very efficient numerical algorithms. It is known that TV denoising suffers from the so-called stair-case effect, which leads to detecting false change points. The objective of this paper is to show that $\ell_1$ trend filtering also suffers from a certain stair-case problem. The analysis is based on an interpretation of the dual variables of the optimization problem in the method as integrated random walk. We discuss consistency conditions for $\ell_1$ trend filtering, how to monitor their fulfillment, and how to modify the algorithm to avoid the stair-case false detection problem.

SYJul 22, 2014
Approximate Regularization Path for Nuclear Norm Based H2 Model Reduction

Niclas Blomberg, Cristian R. Rojas, Bo Wahlberg

This paper concerns model reduction of dynamical systems using the nuclear norm of the Hankel matrix to make a trade-off between model fit and model complexity. This results in a convex optimization problem where this trade-off is determined by one crucial design parameter. The main contribution is a methodology to approximately calculate all solutions up to a certain tolerance to the model reduction problem as a function of the design parameter. This is called the regularization path in sparse estimation and is a very important tool in order to find the appropriate balance between fit and complexity. We extend this to the more complicated nuclear norm case. The key idea is to determine when to exactly calculate the optimal solution using an upper bound based on the so-called duality gap. Hence, by solving a fixed number of optimization problems the whole regularization path up to a given tolerance can be efficiently computed. We illustrate this approach on some numerical examples.

STJan 21, 2014
On change point detection using the fused lasso method

Cristian R. Rojas, Bo Wahlberg

In this paper we analyze the asymptotic properties of l1 penalized maximum likelihood estimation of signals with piece-wise constant mean values and/or variances. The focus is on segmentation of a non-stationary time series with respect to changes in these model parameters. This change point detection and estimation problem is also referred to as total variation denoising or l1 -mean filtering and has many important applications in most fields of science and engineering. We establish the (approximate) sparse consistency properties, including rate of convergence, of the so-called fused lasso signal approximator (FLSA). We show that this only holds if the sign of the corresponding consecutive changes are all different, and that this estimator is otherwise incapable of correctly detecting the underlying sparsity pattern. The key idea is to notice that the optimality conditions for this problem can be analyzed using techniques related to brownian bridge theory.

MLMar 8, 2012
An ADMM Algorithm for a Class of Total Variation Regularized Estimation Problems

Bo Wahlberg, Stephen Boyd, Mariette Annergren et al.

We present an alternating augmented Lagrangian method for convex optimization problems where the cost function is the sum of two terms, one that is separable in the variable blocks, and a second that is separable in the difference between consecutive variable blocks. Examples of such problems include Fused Lasso estimation, total variation denoising, and multi-period portfolio optimization with transaction costs. In each iteration of our method, the first step involves separately optimizing over each variable block, which can be carried out in parallel. The second step is not separable in the variables, but can be carried out very efficiently. We apply the algorithm to segmentation of data based on changes inmean (l_1 mean filtering) or changes in variance (l_1 variance filtering). In a numerical example, we show that our implementation is around 10000 times faster compared with the generic optimization solver SDPT3.