SYMar 1, 2013
Designing Unimodular Codes via Quadratic Optimization is not Always HardMojtaba Soltanalian, Petre Stoica
The NP-hard problem of optimizing a quadratic form over the unimodular vector set arises in radar code design scenarios as well as other active sensing and communication applications. To tackle this problem (which we call unimodular quadratic programming (UQP)), several computational approaches are devised and studied. A specialized local optimization scheme for UQP is introduced and shown to yield superior results compared to general local optimization methods. Furthermore, a \textbf{m}onotonically \textbf{er}ror-bound \textbf{i}mproving \textbf{t}echnique (MERIT) is proposed to obtain the global optimum or a local optimum of UQP with good sub-optimality guarantees. The provided sub-optimality guarantees are case-dependent and generally outperform the $π/4$ approximation guarantee of semi-definite relaxation. Several numerical examples are presented to illustrate the performance of the proposed method. The examples show that for cases including several matrix structures used in radar code design, MERIT can solve UQP efficiently in the sense of sub-optimality guarantee and computational time.
MLJan 20, 2023
Off-Policy Evaluation with Out-of-Sample GuaranteesSofia Ek, Dave Zachariah, Fredrik D. Johansson et al.
We consider the problem of evaluating the performance of a decision policy using past observational data. The outcome of a policy is measured in terms of a loss (aka. disutility or negative reward) and the main problem is making valid inferences about its out-of-sample loss when the past data was observed under a different and possibly unknown policy. Using a sample-splitting method, we show that it is possible to draw such inferences with finite-sample coverage guarantees about the entire loss distribution, rather than just its mean. Importantly, the method takes into account model misspecifications of the past policy - including unmeasured confounding. The evaluation method can be used to certify the performance of a policy using observational data under a specified range of credible model assumptions.
MLJun 22, 2022
Diagnostic Tool for Out-of-Sample Model EvaluationLudvig Hult, Dave Zachariah, Petre Stoica
Assessment of model fitness is a key part of machine learning. The standard paradigm is to learn models by minimizing a chosen loss function averaged over training data, with the aim of achieving small losses on future data. In this paper, we consider the use of a finite calibration data set to characterize the future, out-of-sample losses of a model. We propose a simple model diagnostic tool that provides finite-sample guarantees under weak assumptions. The tool is simple to compute and to interpret. Several numerical experiments are presented to show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyper-parameter tuning.
MLMay 23, 2024
Certified Inventory Control of Critical ResourcesLudvig Hult, Dave Zachariah, Petre Stoica
Inventory control is subject to service-level requirements, in which sufficient stock levels must be maintained despite an unknown demand. We propose a data-driven order policy that certifies any prescribed service level under minimal assumptions on the unknown demand process. The policy achieves this using any online learning method along with integral action. We further propose an inference method that is valid in finite samples. The properties and theoretical guarantees of the method are illustrated using both synthetic and real-world data.
SPFeb 6, 2022
Learning Sparse Graphs via Majorization-Minimization for Smooth Node SignalsGhania Fatima, Aakash Arora, Prabhu Babu et al.
In this letter, we propose an algorithm for learning a sparse weighted graph by estimating its adjacency matrix under the assumption that the observed signals vary smoothly over the nodes of the graph. The proposed algorithm is based on the principle of majorization-minimization (MM), wherein we first obtain a tight surrogate function for the graph learning objective and then solve the resultant surrogate problem which has a simple closed form solution. The proposed algorithm does not require tuning of any hyperparameter and it has the desirable feature of eliminating the inactive variables in the course of the iterations - which can help speeding up the algorithm. The numerical simulations conducted using both synthetic and real world (brain-network) data show that the proposed algorithm converges faster, in terms of the average number of iterations, than several existing methods in the literature.
STJan 21, 2022
Tuned Regularized Estimators for Linear Regression via Covariance FittingPer Mattsson, Dave Zachariah, Petre Stoica
We consider the problem of finding tuned regularized parameter estimators for linear models. We start by showing that three known optimal linear estimators belong to a wider class of estimators that can be formulated as a solution to a weighted and constrained minimization problem. The optimal weights, however, are typically unknown in many applications. This begs the question, how should we choose the weights using only the data? We propose using the covariance fitting SPICE-methodology to obtain data-adaptive weights and show that the resulting class of estimators yields tuned versions of known regularized estimators - such as ridge regression, LASSO, and regularized least absolute deviation. These theoretical results unify several important estimators under a common umbrella. The resulting tuned estimators are also shown to be practically relevant by means of a number of numerical examples.
MLOct 19, 2021
Learning Pareto-Efficient Decisions with ConfidenceSofia Ek, Dave Zachariah, Petre Stoica
The paper considers the problem of multi-objective decision support when outcomes are uncertain. We extend the concept of Pareto-efficient decisions to take into account the uncertainty of decision outcomes across varying contexts. This enables quantifying trade-offs between decisions in terms of tail outcomes that are relevant in safety-critical applications. We propose a method for learning efficient decisions with statistical confidence, building on results from the conformal prediction literature. The method adapts to weak or nonexistent context covariate overlap and its statistical guarantees are evaluated using both synthetic and real data.
MLMay 18, 2021
Robust Learning in Heterogeneous ContextsMuhammad Osama, Dave Zachariah, Petre Stoica
We consider the problem of learning from training data obtained in different contexts, where the underlying context distribution is unknown and is estimated empirically. We develop a robust method that takes into account the uncertainty of the context distribution. Unlike the conventional and overly conservative minimax approach, we focus on excess risks and construct distribution sets with statistical coverage to achieve an appropriate trade-off between performance and robustness. The proposed method is computationally scalable and shown to interpolate between empirical risk minimization and minimax regret objectives. Using both real and synthetic data, we demonstrate its ability to provide robustness in worst-case scenarios without harming performance in the nominal scenario.
ITOct 9, 2020
Robust Localization in Wireless Networks From Corrupted SignalsMuhammad Osama, Dave Zachariah, Satyam Dwivedi et al.
We address the problem of timing-based localization in wireless networks, when an unknown fraction of data is corrupted by nonideal signal conditions. While timing-based techniques enable accurate localization, they are also sensitive to such corrupted data. We develop a robust method that is applicable to a range of localization techniques, including time-of-arrival, time-difference-of-arrival and time-difference in schedule-based transmissions. The method is nonparametric and requires only an upper bound on the fraction of corrupted data, thus obviating distributional assumptions of the corrupting noise distribution. The robustness of the method is demonstrated in numerical experiments.
MLJul 3, 2020
Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample GuaranteesMuhammad Osama, Dave Zachariah, Petre Stoica
A spatial point process can be characterized by an intensity function which predicts the number of events that occur across space. In this paper, we develop a method to infer predictive intensity intervals by learning a spatial model using a regularized criterion. We prove that the proposed method exhibits out-of-sample prediction performance guarantees which, unlike standard estimators, are valid even when the spatial model is misspecified. The method is demonstrated using synthetic as well as real spatial data.
SPDec 16, 2019
Robust Prediction when Features are MissingXiuming Liu, Dave Zachariah, Petre Stoica
Predictors are learned using past training data which may contain features that are unavailable at the time of prediction. We develop an approach that is robust against outlying missing features, based on the optimality properties of an oracle predictor which observes them. The robustness properties of the approach are demonstrated on both real and synthetic data.
STFeb 26, 2019
Effect Inference from Two-Group Data with Sampling BiasDave Zachariah, Petre Stoica
In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences. Here we develop an inference method that is resilient to sampling biases and is able to control the false positive errors under moderate bias levels in contrast to the standard approach. We demonstrate the method using synthetic and real biomarker data.
MEAug 17, 2018
Data Consistency Approach to Model ValidationAndreas Svensson, Dave Zachariah, Petre Stoica et al.
In scientific inference problems, the underlying statistical modeling assumptions have a crucial impact on the end results. There exist, however, only a few automatic means for validating these fundamental modelling assumptions. The contribution in this paper is a general criterion to evaluate the consistency of a set of statistical models with respect to observed data. This is achieved by automatically gauging the models' ability to generate data that is similar to the observed data. Importantly, the criterion follows from the model class itself and is therefore directly applicable to a broad range of inference problems with varying data types, ranging from independent univariate data to high-dimensional time-series. The proposed data consistency criterion is illustrated, evaluated and compared to several well-established methods using three synthetic and two real data sets.
STMay 19, 2017
Model-Robust Counterfactual Prediction MethodDave Zachariah, Petre Stoica
We develop a novel method for counterfactual analysis based on observational data using prediction intervals for units under different exposures. Unlike methods that target heterogeneous or conditional average treatment effects of an exposure, the proposed approach aims to take into account the irreducible dispersions of counterfactual outcomes so as to quantify the relative impact of different exposures. The prediction intervals are constructed in a distribution-free and model-robust manner based on the conformal prediction approach. The computational obstacles to this approach are circumvented by leveraging properties of a tuning-free method that learns sparse additive predictor models for counterfactual outcomes. The method is illustrated using both real and synthetic data.
LGMar 15, 2017
Online Learning for Distribution-Free PredictionDave Zachariah, Petre Stoica, Thomas B. Schön
We develop an online learning method for prediction, which is important in problems with large and/or streaming data sets. We formulate the learning approach using a covariance-fitting methodology, and show that the resulting predictor has desirable computational and distribution-free properties: It is implemented online with a runtime that scales linearly in the number of samples; has a constant memory requirement; avoids local minima problems; and prunes away redundant feature dimensions without relying on restrictive assumptions on the data distribution. In conjunction with the split conformal approach, it also produces distribution-free prediction confidence intervals in a computationally efficient manner. The method is demonstrated on both real and synthetic datasets.
MLJun 14, 2016
Recursive nonlinear-system identification using latent variablesPer Mattsson, Dave Zachariah, Petre Stoica
In this paper we develop a method for learning nonlinear systems with multiple outputs and inputs. We begin by modelling the errors of a nominal predictor of the system using a latent variable framework. Then using the maximum likelihood principle we derive a criterion for learning the model. The resulting optimization problem is tackled using a majorization-minimization approach. Finally, we develop a convex majorization technique and show that it enables a recursive identification method. The method learns parsimonious predictive models and is tested on both synthetic and real nonlinear systems.
MLJun 13, 2016
Prediction performance after learning in Gaussian process regressionJohan Wågberg, Dave Zachariah, Thomas B. Schön et al.
This paper considers the quantification of the prediction performance in Gaussian process regression. The standard approach is to base the prediction error bars on the theoretical predictive variance, which is a lower bound on the mean square-error (MSE). This approach, however, does not take into account that the statistical model is learned from the data. We show that this omission leads to a systematic underestimation of the prediction errors. Starting from a generalization of the Cramér-Rao bound, we derive a more accurate MSE bound which provides a measure of uncertainty for prediction of Gaussian processes. The improved bound is easily computed and we illustrate it using synthetic and real data examples. of uncertainty for prediction of Gaussian processes and illustrate it using synthetic and real data examples.