OCJun 23, 2019
On the Ergodic Control of EnsemblesAndre R. Fioravanti, Jakub Marecek, Robert N. Shorten et al.
Across smart-grid and smart-city application domains, there are many problems where an ensemble of agents is to be controlled such that both the aggregate behaviour and individual-level perception of the system's performance are acceptable. In many applications, traditional PI control is used to regulate aggregate ensemble performance. Our principal contribution in this note is to demonstrate that PI control may not be always suitable for this purpose, and in some situations may lead to a loss of ergodicity for closed-loop systems. Building on this observation, a theoretical framework is proposed to both analyse and design control systems for the regulation of large scale ensembles of agents with a probabilistic intent. Examples are given to illustrate our results.
AISep 3, 2022
Closed-Loop View of the Regulation of AI: Equal Impact across Repeated InteractionsQuan Zhou, Ramen Ghosh, Robert Shorten et al.
There has been much recent interest in the regulation of AI. We argue for a view based on civil-rights legislation, built on the notions of equal treatment and equal impact. In a closed-loop view of the AI system and its users, the equal treatment concerns one pass through the loop. Equal impact, in our view, concerns the long-run average behaviour across repeated interactions. In order to establish the existence of the average and its properties, one needs to study the ergodic properties of the closed-loop and its unique stationary measure.
OCJun 22, 2019
Three Formulations of the Kuramoto Model as a System of Polynomial EquationsTianran Chen, Jakub Marecek, Dhagash Mehta et al.
We compare three formulations of stationary equations of the Kuramoto model as systems of polynomial equations. In the comparison, we present bounds on the numbers of real equilibria based on the work of Bernstein, Kushnirenko, and Khovanskii, and performance of methods for the optimisation over the set of equilibria based on the work of Lasserre, both of which could be of independent interest.
LGSep 12, 2022
Fairness in Forecasting of Observations of Linear Dynamical SystemsQuan Zhou, Jakub Marecek, Robert N. Shorten
In machine learning, training data often capture the behaviour of multiple subgroups of some underlying human population. This behaviour can often be modelled as observations of an unknown dynamical system with an unobserved state. When the training data for the subgroups are not controlled carefully, however, under-representation bias arises. To counter under-representation bias, we introduce two natural notions of fairness in time-series forecasting problems: subgroup fairness and instantaneous fairness. These notions extend predictive parity to the learning of dynamical systems. We also show globally convergent methods for the fairness-constrained learning problems using hierarchies of convexifications of non-commutative polynomial optimisation problems. We also show that by exploiting sparsity in the convexifications, we can reduce the run time of our methods considerably. Our empirical results on a biased data set motivated by insurance applications and the well-known COMPAS data set demonstrate the efficacy of our methods.
OCJun 23, 2022
Stochastic Langevin Differential Inclusions with Applications to Machine LearningFabio V. Difonzo, Vyacheslav Kungurtsev, Jakub Marecek
Stochastic differential equations of Langevin-diffusion form have received significant attention, thanks to their foundational role in both Bayesian sampling algorithms and optimization in machine learning. In the latter, they serve as a conceptual model of the stochastic gradient flow in training over-parameterized models. However, the literature typically assumes smoothness of the potential, whose gradient is the drift term. Nevertheless, there are many problems for which the potential function is not continuously differentiable, and hence the drift is not Lipschitz continuous everywhere. This is exemplified by robust losses and Rectified Linear Units in regression problems. In this paper, we show some foundational results regarding the flow and asymptotic properties of Langevin-type Stochastic Differential Inclusions under assumptions appropriate to the machine-learning settings. In particular, we show strong existence of the solution, as well as an asymptotic minimization of the canonical free-energy functional.
LGJun 11, 2023
Improving the Validity of Decision Trees as ExplanationsJiri Nemecek, Tomas Pevny, Jakub Marecek
In classification and forecasting with tabular data, one often utilizes tree-based models. Those can be competitive with deep neural networks on tabular data and, under some conditions, explainable. The explainability depends on the depth of the tree and the accuracy in each leaf of the tree. We point out that decision trees containing leaves with unbalanced accuracy can provide misleading explanations. Low-accuracy leaves give less valid explanations, which could be interpreted as unfairness among subgroups utilizing these explanations. Here, we train a shallow tree with the objective of minimizing the maximum misclassification error across all leaf nodes. The shallow tree provides a global explanation, while the overall statistical performance of the shallow tree can become comparable to state-of-the-art methods (e.g., well-tuned XGBoost) by extending the leaves with further models.
OCJun 25, 2018
Transmission-Constrained Unit CommitmentClaudio Gambella, Jakub Marecek, Martin Mevissen et al.
The unit commitment with transmission constraints in the alternating-current (AC) model is a challenging mixed-integer non-linear optimisation problem. We present an approach based on decomposition of a Mixed-Integer Semidefinite Programming (MISDP) problem into a mixed-integer quadratic (MIQP) master problem and a semidefinite programming (SDP) sub-problem. Between the master problem and the sub-problem, we pass novel classes of cuts. We analyse finite convergence to the optimum of the MISDP and report promising computational results on a test case from the Canary Islands, Spain.
LGOct 5, 2023
Taming Binarized Neural Networks and Mixed-Integer ProgramsJohannes Aspman, Georgios Korpas, Jakub Marecek
There has been a great deal of recent interest in binarized neural networks, especially because of their explainability. At the same time, automatic differentiation algorithms such as backpropagation fail for binarized neural networks, which limits their applicability. By reformulating the problem of training binarized neural networks as a subadditive dual of a mixed-integer program, we show that binarized neural networks admit a tame representation. This, in turn, makes it possible to use the framework of Bolte et al. for implicit differentiation, which offers the possibility for practical implementation of backpropagation in the context of binarized neural networks. This approach could also be used for a broader class of mixed-integer programs, beyond the training of binarized neural networks, as encountered in symbolic approaches to AI and beyond.
OCNov 3, 2023
Joint Problems in Learning Multiple Dynamical SystemsMengjia Niu, Xiaoyu He, Petr Ryšavý et al.
Clustering of time series is a well-studied problem, with applications ranging from quantitative, personalized models of metabolism obtained from metabolite concentrations to state discrimination in quantum information theory. We consider a variant, where given a set of trajectories and a number of parts, we jointly partition the set of trajectories and learn linear dynamical system (LDS) models for each part, so as to minimize the maximum error across all the models. We present globally convergent methods and EM heuristics, accompanied by promising computational results. The key highlight of this method is that it does not require a predefined hidden state dimension but instead provides an upper bound. Additionally, it offers guidance for determining regularization in the system identification.
OCNov 22, 2023
Piecewise Polynomial Regression of Tame Functions via Integer ProgrammingGilles Bareilles, Johannes Aspman, Jiri Nemecek et al.
Tame functions are a class of nonsmooth, nonconvex functions, which feature in a wide range of applications: functions encountered in the training of deep neural networks with all common activations, value functions of mixed-integer programs, or wave functions of small molecules. We consider approximating tame functions with piecewise polynomial functions. We bound the quality of approximation of a tame function by a piecewise polynomial function with a given number of segments on any full-dimensional cube. We also present the first mixed-integer programming formulation of piecewise polynomial regression. Together, these can be used to estimate tame functions. We demonstrate promising computational results.
LGOct 17, 2023
Group-blind optimal transport to group parity and its constrained variantsQuan Zhou, Jakub Marecek
Fairness holds a pivotal role in the realm of machine learning, particularly when it comes to addressing groups categorised by protected attributes, e.g., gender, race. Prevailing algorithms in fair learning predominantly hinge on accessibility or estimations of these protected attributes, at least in the training process. We design a single group-blind projection map that aligns the feature distributions of both groups in the source data, achieving (demographic) group parity, without requiring values of the protected attribute for individual samples in the computation of the map, as well as its use. Instead, our approach utilises the feature distributions of the privileged and unprivileged groups in a boarder population and the essential assumption that the source data are unbiased representation of the population. We present numerical results on synthetic data and real data.
AIJul 13, 2025Code
humancompatible.interconnect: Testing Properties of Repeated Uses of Interconnections of AI SystemsRodion Nazarov, Anthony Quinn, Robert Shorten et al.
Artificial intelligence (AI) systems often interact with multiple agents. The regulation of such AI systems often requires that {\em a priori\/} guarantees of fairness and robustness be satisfied. With stochastic models of agents' responses to the outputs of AI systems, such {\em a priori\/} guarantees require non-trivial reasoning about the corresponding stochastic systems. Here, we present an open-source PyTorch-based toolkit for the use of stochastic control techniques in modelling interconnections of AI systems and properties of their repeated uses. It models robustness and fairness desiderata in a closed-loop fashion, and provides {\em a priori\/} guarantees for these interconnections. The PyTorch-based toolkit removes much of the complexity associated with the provision of fairness guarantees for closed-loop models of multi-agent systems.
AIJan 25, 2024Code
Generating Likely Counterfactuals Using Sum-Product NetworksJiri Nemecek, Tomas Pevny, Jakub Marecek
The need to explain decisions made by AI systems is driven by both recent regulation and user demand. The decisions are often explainable only post hoc. In counterfactual explanations, one may ask what constitutes the best counterfactual explanation. Clearly, multiple criteria must be taken into account, although "distance from the sample" is a key criterion. Recent methods that consider the plausibility of a counterfactual seem to sacrifice this original objective. Here, we present a system that provides high-likelihood explanations that are, at the same time, close and sparse. We show that the search for the most likely explanations satisfying many common desiderata for counterfactual explanations can be modeled using Mixed-Integer Optimization (MIO). We use a Sum-Product Network (SPN) to estimate the likelihood of a counterfactual. To achieve that, we propose an MIO formulation of an SPN, which can be of independent interest. The source code with examples is available at https://github.com/Epanemu/LiCE.
LGMar 28, 2024
Fairness in Ranking: Robustness through Randomization without the Protected AttributeAndrii Kliachkin, Eleni Psaroudaki, Jakub Marecek et al.
There has been great interest in fairness in machine learning, especially in relation to classification problems. In ranking-related problems, such as in online advertising, recommender systems, and HR automation, much work on fairness remains to be done. Two complications arise: first, the protected attribute may not be available in many applications. Second, there are multiple measures of fairness of rankings, and optimization-based methods utilizing a single measure of fairness of rankings may produce rankings that are unfair with respect to other measures. In this work, we propose a randomized method for post-processing rankings, which do not require the availability of the protected attribute. In an extensive numerical study, we show the robustness of our methods with respect to P-Fairness and effectiveness with respect to Normalized Discounted Cumulative Gain (NDCG) from the baseline ranking, improving on previously proposed methods.
LGFeb 4, 2025
Sample Complexity of Bias Detection with Subsampled Point-to-Subspace DistancesGerman Martinez Matilla, Jakub Marecek
Sample complexity of bias estimation is a lower bound on the runtime of any bias detection method. Many regulatory frameworks require the bias to be tested for all subgroups, whose number grows exponentially with the number of protected attributes. Unless one wishes to run a bias detection with a doubly-exponential run-time, one should like to have polynomial complexity of bias detection for a single subgroup. At the same time, the reference data may be based on surveys, and thus come with non-trivial uncertainty. Here, we reformulate bias detection as a point-to-subspace problem on the space of measures and show that, for supremum norm, it can be subsampled efficiently. In particular, our probabilistically approximately correct (PAC) results are corroborated by tests on well-known instances.
QUANT-PHFeb 8, 2024
Learning quantum Hamiltonians at any temperature in polynomial time with Chebyshev and bit complexityAles Wodecki, Jakub Marecek
We consider the problem of learning local quantum Hamiltonians given copies of their Gibbs state at a known inverse temperature, following Haah et al. [2108.04842] and Bakshi et al. [arXiv:2310.02243]. Our main technical contribution is a new flat polynomial approximation of the exponential function based on the Chebyshev expansion, which enables the formulation of learning quantum Hamiltonians as a polynomial optimization problem. This, in turn, can benefit from the use of moment/SOS relaxations, whose polynomial bit complexity requires careful analysis [O'Donnell, ITCS 2017]. Finally, we show that learning a $k$-local Hamiltonian, whose dual interaction graph is of bounded degree, runs in polynomial time under mild assumptions.
LGJan 26
EVEREST: An Evidential, Tail-Aware Transformer for Rare-Event Time-Series ForecastingAntanas Zilinskas, Robert N. Shorten, Jakub Marecek
Forecasting rare events in multivariate time-series data is challenging due to severe class imbalance, long-range dependencies, and distributional uncertainty. We introduce EVEREST, a transformer-based architecture for probabilistic rare-event forecasting that delivers calibrated predictions and tail-aware risk estimation, with auxiliary interpretability via attention-based signal attribution. EVEREST integrates four components: (i) a learnable attention bottleneck for soft aggregation of temporal dynamics; (ii) an evidential head for estimating aleatoric and epistemic uncertainty via a Normal--Inverse--Gamma distribution; (iii) an extreme-value head that models tail risk using a Generalized Pareto Distribution; and (iv) a lightweight precursor head for early-event detection. These modules are jointly optimized with a composite loss (focal loss, evidential NLL, and a tail-sensitive EVT penalty) and act only at training time; deployment uses a single classification head with no inference overhead (approximately 0.81M parameters). On a decade of space-weather data, EVEREST achieves state-of-the-art True Skill Statistic (TSS) of 0.973/0.970/0.966 at 24/48/72-hour horizons for C-class flares. The model is compact, efficient to train on commodity hardware, and applicable to high-stakes domains such as industrial monitoring, weather, and satellite diagnostics. Limitations include reliance on fixed-length inputs and exclusion of image-based modalities, motivating future extensions to streaming and multimodal forecasting.
AISep 29, 2025
humancompatible.detect: a Python Toolkit for Detecting Bias in AI ModelsGerman M. Matilla, Jiri Nemecek, Illia Kryvoviaz et al.
There is a strong recent emphasis on trustworthy AI. In particular, international regulations, such as the AI Act, demand that AI practitioners measure data quality on the input and estimate bias on the output of high-risk AI systems. However, there are many challenges involved, including scalability (MMD) and computability (Wasserstein-1) issues of traditional methods for estimating distances on measure spaces. Here, we present humancompatible.detect, a toolkit for bias detection that addresses these challenges. It incorporates two newly developed methods to detect and evaluate bias: maximum subgroup discrepancy (MSD) and subsampled $\ell_\infty$ distances. It has an easy-to-use API documented with multiple examples. humancompatible.detect is licensed under the Apache License, Version 2.0.
LGSep 18, 2025
Stochastic Sample Approximations of (Local) Moduli of ContinuityRodion Nazarov, Allen Gehret, Robert Shorten et al.
Modulus of local continuity is used to evaluate the robustness of neural networks and fairness of their repeated uses in closed-loop models. Here, we revisit a connection between generalized derivatives and moduli of local continuity, and present a non-uniform stochastic sample approximation for moduli of local continuity. This is of importance in studying robustness of neural networks and fairness of their repeated uses.
LGAug 19, 2025
Learning Time-Varying Convexifications of Multiple Fairness MeasuresQuan Zhou, Jakub Marecek, Robert Shorten
There is an increasing appreciation that one may need to consider multiple measures of fairness, e.g., considering multiple group and individual fairness notions. The relative weights of the fairness regularisers are a priori unknown, may be time varying, and need to be learned on the fly. We consider the learning of time-varying convexifications of multiple fairness measures with limited graph-structured feedback.
LGOct 21, 2024
ExDBN: Learning Dynamic Bayesian Networks using Extended Mixed-Integer Programming FormulationsPavel Rytir, Ales Wodecki, Georgios Korpas et al.
Causal learning from data has received much attention recently. Bayesian networks can be used to capture causal relationships. There, one recovers a weighted directed acyclic graph in which random variables are represented by vertices, and the weights associated with each edge represent the strengths of the causal relationships between them. This concept is extended to capture dynamic effects by introducing a dependency on past data, which may be captured by the structural equation model. This formalism is utilized in the present contribution to propose a score-based learning algorithm. A mixed-integer quadratic program is formulated and an algorithmic solution proposed, in which the pre-generation of exponentially many acyclicity constraints is avoided by utilizing the so-called branch-and-cut (``lazy constraint'') method. Comparing the novel approach to the state-of-the-art, we show that the proposed approach turns out to produce more accurate results when applied to small and medium-sized synthetic instances containing up to 80 time series. Lastly, two interesting applications in bioscience and finance, to which the method is directly applied, further stress the importance of developing highly accurate, globally convergent solvers that can handle instances of modest size.
LGJun 21, 2024
ExDAG: an MIQP Algorithm for Learning DAGsPavel Rytir, Ales Wodecki, Jakub Marecek
There has been a growing interest in causal learning in recent years. Commonly used representations of causal structures, including Bayesian networks and structural equation models (SEM), take the form of directed acyclic graphs (DAGs). We provide a novel mixed-integer quadratic programming formulation and an associated algorithm that identifies DAGs with a low structural Hamming distance between the identified DAG and the ground truth, under identifiability assumptions. The eventual exact learning is guaranteed by the global convergence of the branch-and-bound-and-cut algorithm, which is utilized. In addition to this, integer programming techniques give us access to the dual bound, which allows for a real time assessment of the quality of solution. Previously, integer programming techniques have been shown to lead to limited scaling in the case of DAG identification due to the super exponential number of constraints, which prevent the formation of cycles. The algorithm proposed circumvents this by selectively generating only the violated constraints using the so-called "lazy" constraints methodology. Our empirical results show that ExDAG outperforms state-of-the-art solvers in terms of structural Hamming distance and $F_1$ score when considering Gaussian noise on medium-sized graphs.
QUANT-PHMar 31, 2022
Quantum open system identification via global optimization: Optimally accurate Markovian models of open systems from time-series dataZakhar Popovych, Kurt Jacobs, Georgios Korpas et al.
Accurate models of the dynamics of quantum circuits are essential for optimizing and advancing quantum devices. Since first-principles models of environmental noise and dissipation in real quantum systems are often unavailable, deriving accurate models from measured time-series data is critical. However, identifying open quantum systems poses significant challenges: powerful methods from systems engineering can perform poorly beyond weak damping (as we show) because they fail to incorporate essential constraints required for quantum evolution (e.g., positivity). Common methods that can include these constraints are typically multi-step, fitting linear models to physically grounded master equations, often resulting in non-convex functions in which local optimization algorithms get stuck in local extrema (as we show). In this work, we solve these problems by formulating quantum system identification directly from data as a polynomial optimization problem, enabling the use of recently developed global optimization methods. These methods are essentially guaranteed to reach global optima, allowing us for the first time to efficiently obtain the most accurate Markovian model for a given system. In addition to its practical importance, this allows us to take the error of these Markovian models as an alternative (operational) measure of the non-Markovianity of a system. We test our method with the spin-boson model -- a two-level system coupled to a bath of harmonic oscillators -- for which we obtain the exact evolution using matrix-product-state techniques. We show that polynomial optimization using moment/sum-of-squares approaches significantly outperforms traditional optimization algorithms, and we show that even for strong damping Lindblad-form master equations can provide accurate models of the spin-boson system.
DSNov 19, 2021
Randomized Algorithms for Monotone Submodular Function Maximization on the Integer LatticeAlberto Schiabel, Vyacheslav Kungurtsev, Jakub Marecek
Optimization problems with set submodular objective functions have many real-world applications. In discrete scenarios, where the same item can be selected more than once, the domain is generalized from a 2-element set to a bounded integer lattice. In this work, we consider the problem of maximizing a monotone submodular function on the bounded integer lattice subject to a cardinality constraint. In particular, we focus on maximizing DR-submodular functions, i.e., functions defined on the integer lattice that exhibit the diminishing returns property. Given any epsilon > 0, we present a randomized algorithm with probabilistic guarantees of O(1 - 1/e - epsilon) approximation, using a framework inspired by a Stochastic Greedy algorithm developed for set submodular functions by Mirzasoleiman et al. We then show that, on synthetic DR-submodular functions, applying our proposed algorithm on the integer lattice is faster than the alternatives, including reducing a target problem to the set domain and then applying the fastest known set submodular maximization algorithm.
OCOct 6, 2021
Predictability and Fairness in Load Aggregation and Operations of Virtual Power PlantsJakub Marecek, Michal Roubalik, Ramen Ghosh et al.
In power systems, one wishes to regulate the aggregate demand of an ensemble of distributed energy resources (DERs), such as controllable loads and battery energy storage systems. We suggest a notion of predictability and fairness, which suggests that the long-term averages of prices or incentives offered should be independent of the initial states of the operators of the DER, the aggregator, and the power grid. We show that this notion cannot be guaranteed with many traditional controllers used by the load aggregator, including the usual proportional-integral (PI) controller. We show that even considering the non-linearity of the alternating-current model, this notion of predictability and fairness can be guaranteed for incrementally input-to-state stable (iISS) controllers, under mild assumptions.
AIJun 4, 2021
Subgroup Fairness in Two-Sided MarketsQuan Zhou, Jakub Marecek, Robert N. Shorten
It is well known that two-sided markets are unfair in a number of ways. For instance, female workers at Uber earn less than their male colleagues per mile driven. Similar observations have been made for other minority subgroups in other two-sided markets. Here, we suggest a novel market-clearing mechanism for two-sided markets, which promotes equalisation of the pay per hour worked across multiple subgroups, as well as within each subgroup. In the process, we introduce a novel notion of subgroup fairness (which we call Inter-fairness), which can be combined with other notions of fairness within each subgroup (called Intra-fairness), and the utility for the customers (Customer-Care) in the objective of the market-clearing problem. While the novel non-linear terms in the objective complicate market clearing by making the problem non-convex, we show that a certain non-convex augmented Lagrangian relaxation can be approximated to any precision in time polynomial in the number of market participants using semi-definite programming. This makes it possible to implement the market-clearing mechanism efficiently. On the example of driver-ride assignment in an Uber-like system, we demonstrate the efficacy and scalability of the approach, and trade-offs between Inter- and Intra-fairness.
OCMay 19, 2021
Trilevel and Multilevel Optimization using Monotone Operator TheoryAllahkaram Shafiei, Vyacheslav Kungurtsev, Jakub Marecek
We consider rather a general class of multi-level optimization problems, where a convex objective function is to be minimized subject to constraints of optimality of nested convex optimization problems. As a special case, we consider a trilevel optimization problem, where the objective of the two lower layers consists of a sum of a smooth and a non-smooth term.~Based on fixed-point theory and related arguments, we present a natural first-order algorithm and analyze its convergence and rates of convergence in several regimes of parameters.
OCJan 2, 2021
A space-indexed formulation of packing boxes into a larger boxSam D. Allen, Edmund K. Burke, Jakub Marecek
Current integer programming solvers fail to decide whether 12 unit cubes can be packed into a 1x1x11 box within an hour using the natural relaxation of Chen/Padberg. We present an alternative relaxation of the problem of packing boxes into a larger box, which makes it possible to solve much larger instances.
SOC-PHNov 1, 2020
Screening for an Infectious Disease as a Problem in Stochastic ControlJakub Marecek
There has been much recent interest in screening populations for an infectious disease. Here, we present a stochastic-control model, wherein the optimum screening policy is provably difficult to find, but wherein Thompson sampling has provably optimal performance guarantees in the form of Bayesian regret. Thompson sampling seems applicable especially to diseases, for which we do not understand the dynamics well, such as to the super-spreading COVID-19.
SPJul 31, 2020
Predictability and Fairness in Social SensingRamen Ghosh, Jakub Marecek, Wynita M. Griggs et al.
We consider the design of distributed algorithms that govern the manner in which agents contribute to a social sensing platform. Specifically, we are interested in situations where fairness among the agents contributing to the platform is needed. A notable example are platforms operated by public bodies, where fairness is a legal requirement. The design of such distributed systems is challenging due to the fact that we wish to simultaneously realise an efficient social sensing platform, but also deliver a predefined quality of service to the agents (for example, a fair opportunity to contribute to the platform). In this paper, we introduce iterated function systems (IFS) as a tool for the design and analysis of systems of this kind. We show how the IFS framework can be used to realise systems that deliver a predictable quality of service to agents, can be used to underpin contracts governing the interaction of agents with the social sensing platform, and which are efficient. To illustrate our design via a use case, we consider a large, high-density network of participating parked vehicles. When awoken by an administrative centre, this network proceeds to search for moving missing entities of interest using RFID-based techniques. We regulate which vehicles are actively searching for the moving entity of interest at any point in time. In doing so, we seek to equalise vehicular energy consumption across the network. This is illustrated through simulations of a search for a missing Alzheimer's patient in Melbourne, Australia. Experimental results are presented to illustrate the efficacy of our system and the predictability of access of agents to the platform independent of initial conditions.
LGJun 12, 2020
Fairness in Forecasting and Learning Linear Dynamical SystemsQuan Zhou, Jakub Marecek, Robert N. Shorten
In machine learning, training data often capture the behaviour of multiple subgroups of some underlying human population. When the amounts of training data for the subgroups are not controlled carefully, under-representation bias arises. We introduce two natural notions of subgroup fairness and instantaneous fairness to address such under-representation bias in time-series forecasting problems. In particular, we consider the subgroup-fair and instant-fair learning of a linear dynamical system (LDS) from multiple trajectories of varying lengths, and the associated forecasting problems. We provide globally convergent methods for the learning problems using hierarchies of convexifications of non-commutative polynomial optimisation problems. Our empirical results on a biased data set motivated by insurance applications and the well-known COMPAS data set demonstrate both the beneficial impact of fairness considerations on statistical performance and encouraging effects of exploiting sparsity on run time.
OCFeb 4, 2020
Learning of Linear Dynamical Systems as a Non-Commutative Polynomial Optimization ProblemQuan Zhou, Jakub Marecek
There has been much recent progress in forecasting the next observation of a linear dynamical system (LDS), which is known as the improper learning, as well as in the estimation of its system matrices, which is known as the proper learning of LDS. We present an approach to proper learning of LDS, which in spite of the non-convexity of the problem, guarantees global convergence of numerical solutions to a least-squares estimator. We present promising computational results.
CVDec 9, 2019
Deep Autoencoders with Value-at-Risk Thresholding for Unsupervised Anomaly DetectionAlbert Akhriev, Jakub Marecek
Many real-world monitoring and surveillance applications require non-trivial anomaly detection to be run in the streaming model. We consider an incremental-learning approach, wherein a deep-autoencoding (DAE) model of what is normal is trained and used to detect anomalies at the same time. In the detection of anomalies, we utilise a novel thresholding mechanism, based on value at risk (VaR). We compare the resulting convolutional neural network (CNN) against a number of subspace methods, and present results on changedetection net.
OCSep 16, 2019
On-line Non-Convex Constrained OptimizationOlivier Massicot, Jakub Marecek
Time-varying non-convex continuous-valued non-linear constrained optimization is a fundamental problem. We study conditions wherein a momentum-like regularising term allow for the tracking of local optima by considering an ordinary differential equation (ODE). We then derive an efficient algorithm based on a predictor-corrector method, to track the ODE solution.
OCJun 23, 2019
A Fine-Grained Variant of the Hierarchy of LasserreWann-Jiun Ma, Jakub Marecek, Martin Mevissen
There has been much recent interest in hierarchies of progressively stronger convexifications of polynomial optimisation problems (POP). These often converge to the global optimum of the POP, asymptotically, but prove challenging to solve beyond the first level in the hierarchy for modest instances. We present a finer-grained variant of the Lasserre hierarchy, together with first-order methods for solving the convexifications, which allow for efficient warm-starting with solutions from lower levels in the hierarchy.
LGOct 22, 2018
Using Deep Learning to Extend the Range of Air-Pollution Monitoring and ForecastingPhilipp Haehnel, Jakub Marecek, Julien Monteil et al.
Across numerous applications, forecasting relies on numerical solvers for partial differential equations (PDEs). Although the use of deep-learning techniques has been proposed, actual applications have been restricted by the fact the training data are obtained using traditional PDE solvers. Thereby, the uses of deep-learning techniques were limited to domains, where the PDE solver was applicable. We demonstrate a deep-learning framework for air-pollution monitoring and forecasting that provides the ability to train across different model domains, as well as a reduction in the run-time by two orders of magnitude. It presents a first-of-a-kind implementation that combines deep-learning and domain-decomposition techniques to allow model deployments extend beyond the domain(s) on which the it has been trained.
STSep 16, 2018
On-Line Learning of Linear Dynamical Systems: Exponential Forgetting in Kalman FiltersMark Kozdoba, Jakub Marecek, Tigran Tchrakian et al.
Kalman filter is a key tool for time-series forecasting and analysis. We show that the dependence of a prediction of Kalman filter on the past is decaying exponentially, whenever the process noise is non-degenerate. Therefore, Kalman filter may be approximated by regression on a few recent observations. Surprisingly, we also show that having some process noise is essential for the exponential decay. With no process noise, it may happen that the forecast depends on all of the past uniformly, which makes forecasting more difficult. Based on this insight, we devise an on-line algorithm for improper learning of a linear dynamical system (LDS), which considers only a few most recent observations. We use our decay results to provide the first regret bounds w.r.t. to Kalman filters within learning an LDS. That is, we compare the results of our algorithm to the best, in hindsight, Kalman filter for a given signal. Also, the algorithm is practical: its per-update run-time is linear in the regression depth.
OCSep 10, 2018
Pursuit of Low-Rank Models of Time-Varying Matrices Robust to Sparse and Measurement NoiseAlbert Akhriev, Jakub Marecek, Andrea Simonetto
In tracking of time-varying low-rank models of time-varying matrices, we present a method robust to both uniformly-distributed measurement noise and arbitrarily-distributed ``sparse'' noise. In theory, we bound the tracking error. In practice, our use of randomised coordinate descent is scalable and allows for encouraging results on changedetection net, a benchmark.
LGAug 3, 2018
Robust Spectral Filtering and Anomaly DetectionJakub Marecek, Tigran Tchrakian
We consider a setting, where the output of a linear dynamical system (LDS) is, with an unknown but fixed probability, replaced by noise. There, we present a robust method for the prediction of the outputs of the LDS and identification of the samples of noise, and prove guarantees on its statistical performance. One application lies in anomaly detection: the samples of noise, unlikely to have been generated by the dynamics, can be flagged to operators of the system for further study.
OCApr 12, 2016
Resource Allocation with Population DynamicsJonathan Epperlein, Jakub Marecek
Many analyses of resource-allocation problems employ simplistic models of the population. Using the example of a resource-allocation problem of Marecek et al. [arXiv:1406.7639], we introduce rather a general behavioural model, where the evolution of a heterogeneous population of agents is governed by a Markov chain. Still, we are able to show that the distribution of agents across resources converges in distribution, for suitable means of information provision, under certain assumptions. The model and proof techniques may have wider applicability.
OCJan 25, 2016
Pricing Vehicle Sharing with Proximity InformationJakub Marecek, Robert Shorten, Jia Yuan Yu
For vehicle sharing schemes, where drop-off positions are not fixed, we propose a pricing scheme, where the price depends in part on the distance between where a vehicle is being dropped off and where the closest shared vehicle is parked. Under certain restrictive assumptions, we show that this pricing leads to a socially optimal spread of the vehicles within a region.
CLDec 5, 2014
Integer-Programming Ensemble of Temporal-Relations ClassifiersCatherine Kerr, Terri Hoare, Paula Carroll et al.
The extraction and understanding of temporal events and their relations are major challenges in natural language processing. Processing text on a sentence-by-sentence or expression-by-expression basis often fails, in part due to the challenge of capturing the global consistency of the text. We present an ensemble method, which reconciles the outputs of multiple classifiers of temporal expressions across the text using integer programming. Computational experiments show that the ensemble improves upon the best individual results from two recent challenges, SemEval-2013 TempEval-3 (Temporal Annotation) and SemEval-2016 Task 12 (Clinical TempEval).
OCNov 20, 2014
Optimal Power Flow as a Polynomial Optimization ProblemBissan Ghaddar, Jakub Marecek, Martin Mevissen
Formulating the alternating current optimal power flow (ACOPF) as a polynomial optimization problem makes it possible to solve large instances in practice and to guarantee asymptotic convergence in theory.
OCAug 11, 2014
Matrix Completion under Interval UncertaintyJakub Marecek, Peter Richtarik, Martin Takac
Matrix completion under interval uncertainty can be cast as matrix completion with element-wise box constraints. We present an efficient alternating-direction parallel coordinate-descent method for the problem. We show that the method outperforms any other known method on a benchmark in image in-painting in terms of signal-to-noise ratio, and that it provides high-quality solutions for an instance of collaborative filtering with 100,198,805 recommendations within 5 minutes.
OCApr 9, 2014
r-Extreme Signalling for Congestion ControlJakub Marecek, Robert Shorten, Jia Yuan Yu
In many "smart city" applications, congestion arises in part due to the nature of signals received by individuals from a central authority. In the model of Marecek et al. [arXiv:1406.7639, Int. J. Control 88(10), 2015], each agent uses one out of multiple resources at each time instant. The per-use cost of a resource depends on the number of concurrent users. A central authority has up-to-date knowledge of the congestion across all resources and uses randomisation to provide a scalar or an interval for each resource at each time. In this paper, the interval to broadcast per resource is obtained by taking the minima and maxima of costs observed within a time window of length r, rather than by randomisation. We show that the resulting distribution of agents across resources also converges in distribution, under plausible assumptions about the evolution of the population over time.