LGMar 24Code
A Sobering Look at Tabular Data Generation via Probabilistic CircuitsDavide Scassola, Dylan Ponsford, Adrián Javaloy et al.
Tabular data is more challenging to generate than text and images, due to its heterogeneous features and much lower sample sizes. On this task, diffusion-based models are the current state-of-the-art (SotA) model class, achieving almost perfect performance on commonly used benchmarks. In this paper, we question the perception of progress for tabular data generation. First, we highlight the limitations of current protocols to evaluate the fidelity of generated data, and advocate for alternative ones. Next, we revisit a simple baseline -- hierarchical mixture models in the form of deep probabilistic circuits (PCs) -- which delivers competitive or superior performance to SotA models for a fraction of the cost. PCs are the generative counterpart of decision forests, and as such can natively handle heterogeneous data as well as deliver tractable probabilistic generation and inference. Finally, in a rigorous empirical analysis we show that the apparent saturation of progress for SotA models is largely due to the use of inadequate metrics. As such, we highlight that there is still much to be done to generate realistic tabular data. Code available at https://github.com/april-tools/tabpc.
LGJul 13, 2022
On the Robustness of Bayesian Neural Networks to Adversarial AttacksLuca Bortolussi, Ginevra Carbone, Luca Laurenti et al.
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit BNN posteriors are robust to gradient-based adversarial attacks. Crucially, we prove that the expected gradient of the loss with respect to the BNN posterior distribution is vanishing, even when each neural network sampled from the posterior is vulnerable to gradient-based attacks. Experimental results on the MNIST, Fashion MNIST, and half moons datasets, representing the finite data regime, with BNNs trained with Hamiltonian Monte Carlo and Variational Inference, support this line of arguments, showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free based adversarial attacks.
SYJul 4, 2012
Hybrid performance modelling of opportunistic networksLuca Bortolussi, Vashti Galpin, Jane Hillston
We demonstrate the modelling of opportunistic networks using the process algebra stochastic HYPE. Network traffic is modelled as continuous flows, contact between nodes in the network is modelled stochastically, and instantaneous decisions are modelled as discrete events. Our model describes a network of stationary video sensors with a mobile ferry which collects data from the sensors and delivers it to the base station. We consider different mobility models and different buffer sizes for the ferries. This case study illustrates the flexibility and expressive power of stochastic HYPE. We also discuss the software that enables us to describe stochastic HYPE models and simulate them.
SYMay 31, 2016
Policy learning for time-bounded reachability in Continuous-Time Markov Decision Processes via doubly-stochastic gradient ascentEzio Bartocci, Luca Bortolussi, Tomǎš Brázdil et al.
Continuous-time Markov decision processes are an important class of models in a wide range of applications, ranging from cyber-physical systems to synthetic biology. A central problem is how to devise a policy to control the system in order to maximise the probability of satisfying a set of temporal logic specifications. Here we present a novel approach based on statistical model checking and an unbiased estimation of a functional gradient in the space of possible policies. The statistical approach has several advantages over conventional approaches based on uniformisation, as it can also be applied when the model is replaced by a black box, and does not suffer from state-space explosion. The use of a stochastic gradient to guide our search considerably improves the efficiency of learning policies. We demonstrate the method on a proof-of-principle non-linear population model, showing strong performance in a non-trivial task.
SYJan 11, 2013
Hybrid Behaviour of Markov Population ModelsLuca Bortolussi
We investigate the behaviour of population models written in Stochastic Concurrent Constraint Programming (sCCP), a stochastic extension of Concurrent Constraint Programming. In particular, we focus on models from which we can define a semantics of sCCP both in terms of Continuous Time Markov Chains (CTMC) and in terms of Stochastic Hybrid Systems, in which some populations are approximated continuously, while others are kept discrete. We will prove the correctness of the hybrid semantics from the point of view of the limiting behaviour of a sequence of models for increasing population size. More specifically, we prove that, under suitable regularity conditions, the sequence of CTMC constructed from sCCP programs for increasing population size converges to the hybrid system constructed by means of the hybrid semantics. We investigate in particular what happens for sCCP models in which some transitions are guarded by boolean predicates or in the presence of instantaneous transitions.
SYNov 4, 2022
Conformal Quantitative Predictive Monitoring of STL Requirements for Stochastic ProcessesFrancesca Cairoli, Nicola Paoletti, Luca Bortolussi
We consider the problem of predictive monitoring (PM), i.e., predicting at runtime the satisfaction of a desired property from the current system's state. Due to its relevance for runtime safety assurance and online control, PM methods need to be efficient to enable timely interventions against predicted violations, while providing correctness guarantees. We introduce \textit{quantitative predictive monitoring (QPM)}, the first PM method to support stochastic processes and rich specifications given in Signal Temporal Logic (STL). Unlike most of the existing PM techniques that predict whether or not some property $φ$ is satisfied, QPM provides a quantitative measure of satisfaction by predicting the quantitative (aka robust) STL semantics of $φ$. QPM derives prediction intervals that are highly efficient to compute and with probabilistic guarantees, in that the intervals cover with arbitrary probability the STL robustness values relative to the stochastic evolution of the system. To do so, we take a machine-learning approach and leverage recent advances in conformal inference for quantile regression, thereby avoiding expensive Monte-Carlo simulations at runtime to estimate the intervals. We also show how our monitors can be combined in a compositional manner to handle composite formulas, without retraining the predictors nor sacrificing the guarantees. We demonstrate the effectiveness and scalability of QPM over a benchmark of four discrete-time stochastic processes with varying degrees of complexity.
AIMay 9, 2022
Graph Neural Networks for Propositional Model CountingGaia Saveri, Luca Bortolussi
Graph Neural Networks (GNNs) have been recently leveraged to solve several logical reasoning tasks. Nevertheless, counting problems such as propositional model counting (#SAT) are still mostly approached with traditional solvers. Here we tackle this gap by presenting an architecture based on the GNN framework for belief propagation (BP) of Kuch et al., extended with self-attentive GNN and trained to approximately solve the #SAT problem. We ran a thorough experimental investigation, showing that our model, trained on a small set of random Boolean formulae, is able to scale effectively to much larger problem sizes, with comparable or better performances of state of the art approximate solvers. Moreover, we show that it can be efficiently fine-tuned to provide good generalization results on different formulae distributions, such as those coming from SAT-encoded combinatorial problems.
LGAug 31, 2023
Zero-Shot Conditioning of Score-Based Diffusion Models by Neuro-Symbolic ConstraintsDavide Scassola, Sebastiano Saccani, Ginevra Carbone et al.
Score-based diffusion models have emerged as effective approaches for both conditional and unconditional generation. Still conditional generation is based on either a specific training of a conditional model or classifier guidance, which requires training a noise-dependent classifier, even when a classifier for uncorrupted data is given. We propose a method that, given a pre-trained unconditional score-based generative model, samples from the conditional distribution under arbitrary logical constraints, without requiring additional training. Differently from other zero-shot techniques, that rather aim at generating valid conditional samples, our method is designed for approximating the true conditional distribution. Firstly, we show how to manipulate the learned score in order to sample from an un-normalized distribution conditional on a user-defined constraint. Then, we define a flexible and numerically stable neuro-symbolic framework for encoding soft logical constraints. Combining these two ingredients we obtain a general, but approximate, conditional sampling algorithm. We further developed effective heuristics aimed at improving the approximation. Finally, we show the effectiveness of our approach in approximating conditional distributions for various types of constraints and data: tabular data, images and time series.
LGMay 11, 2022
Scalable Stochastic Parametric Verification with Stochastic Variational Smoothed Model CheckingLuca Bortolussi, Francesca Cairoli, Ginevra Carbone et al.
Parametric verification of linear temporal properties for stochastic models can be expressed as computing the satisfaction probability of a certain property as a function of the parameters of the model. Smoothed model checking (smMC) aims at inferring the satisfaction function over the entire parameter space from a limited set of observations obtained via simulation. As observations are costly and noisy, smMC is framed as a Bayesian inference problem so that the estimates have an additional quantification of the uncertainty. In smMC the authors use Gaussian Processes (GP), inferred by means of the Expectation Propagation algorithm. This approach provides accurate reconstructions with statistically sound quantification of the uncertainty. However, it inherits the well-known scalability issues of GP. In this paper, we exploit recent advances in probabilistic machine learning to push this limitation forward, making Bayesian inference of smMC scalable to larger datasets and enabling its application to models with high dimensional parameter spaces. We propose Stochastic Variational Smoothed Model Checking (SV-smMC), a solution that exploits stochastic variational inference (SVI) to approximate the posterior distribution of the smMC problem. The strength and flexibility of SVI make SV-smMC applicable to two alternative probabilistic models: Gaussian Processes (GP) and Bayesian Neural Networks (BNN). The core ingredient of SVI is a stochastic gradient-based optimization that makes inference easily parallelizable and that enables GPU acceleration. In this paper, we compare the performances of smMC against those of SV-smMC by looking at the scalability, the computational efficiency and the accuracy of the reconstructed satisfaction function.
CEAug 15, 2012
Proceedings First International Workshop on Hybrid Systems and BiologyEzio Bartocci, Luca Bortolussi
This volume contains the proceedings of the First International Workshop on Hybrid Systems and Biology (HSB 2012), that will be held in Newcastle upon Tyne, UK, on the 3rd September, 2012. HSB 2012 is a satellite event of the 23rd International Conference on Concurrency Theory (CONCUR 2012). This workshop aims at collecting scientists working in the area of hybrid modeling applied to systems biology, in order to discuss about current achieved goals, current challenges and future possible developments.
LGNov 6, 2025
Guided by Stars: Interpretable Concept Learning Over Time Series via Temporal Logic SemanticsIrene Ferfoglia, Simone Silvetti, Gaia Saveri et al.
Time series classification is a task of paramount importance, as this kind of data often arises in safety-critical applications. However, it is typically tackled with black-box deep learning methods, making it hard for humans to understand the rationale behind their output. To take on this challenge, we propose a novel approach, STELLE (Signal Temporal logic Embedding for Logically-grounded Learning and Explanation), a neuro-symbolic framework that unifies classification and explanation through direct embedding of trajectories into a space of temporal logic concepts. By introducing a novel STL-inspired kernel that maps raw time series to their alignment with predefined STL formulae, our model jointly optimises accuracy and interpretability, as each prediction is accompanied by the most relevant logical concepts that characterise it. This yields (i) local explanations as human-readable STL conditions justifying individual predictions, and (ii) global explanations as class-characterising formulae. Experiments demonstrate that STELLE achieves competitive accuracy while providing logically faithful explanations, validated on diverse real-world benchmarks.
CRApr 2
Empirical Evaluation of Structured Synthetic Data Privacy Metrics: Novel experimental frameworkMilton Nicolás Plasencia Palacios, Alexander Boudewijn, Sebastiano Saccani et al.
Synthetic data generation is gaining traction as a privacy enhancing technology (PET). When properly generated, synthetic data preserve the analytic utility of real data while avoiding the retention of information that would allow the identification of specific individuals. However, the concept of data privacy remains elusive, making it challenging for practitioners to evaluate and benchmark the degree of privacy protection offered by synthetic data. In this paper, we propose a framework to empirically assess the efficacy of tabular synthetic data privacy quantification methods through controlled, deliberate risk insertion. To demonstrate this framework, we survey existing approaches to synthetic data privacy quantification and the related legal theory. We then apply the framework to the main privacy quantification methods with no-box threat models on publicly available datasets.
ROMay 18
Guiding Neuro-Symbolic Scenario Generation with Spatio-Temporal LogicLorenzo Bonin, Francesco Giacomarra, Luca Bortolussi et al.
The rapid advancement of autonomous driving (AD) technologies has outpaced the development of robust safety evaluation methods. Conventional testing relies on exposing AD systems to vast numbers of real-world traffic scenes -- a brute-force approach that is prohibitively expensive and statistically ineffective at capturing the rare, safety-critical edge cases essential for validating real-world robustness. To address this fundamental limitation, we introduce STRELGen, a scalable framework for the targeted generation of safety-critical driving scenarios. STRELGen synergistically combines a multi-agent trajectory-generation diffusion model (DM) with Spatio-Temporal Logic (STREL) specifications that encode complex safety and realism properties through a highly interpretable formalism. Crucially, monitoring satisfaction levels of these specifications is differentiable, enabling gradient-based search. At inference time, we optimize directly over the DM latent space to maximize STREL formula satisfaction. The result is efficient generation of highly plausible yet safety-critical multi-agent scenarios that lie within the learned data distribution. STRELGen thus provides a flexible, interpretable, and powerful tool for stress-testing autonomous driving systems, moving beyond the limitations of brute-force data collection.
CVMay 25, 2023Code
Blending adversarial training and representation-conditional purification via aggregation improves adversarial robustnessEmanuele Ballarin, Alessio Ansuini, Luca Bortolussi
In this work, we propose a novel adversarial defence mechanism for image classification - CARSO - blending the paradigms of adversarial training and adversarial purification in a synergistic robustness-enhancing way. The method builds upon an adversarially-trained classifier, and learns to map its internal representation associated with a potentially perturbed input onto a distribution of tentative clean reconstructions. Multiple samples from such distribution are classified by the same adversarially-trained model, and a carefully chosen aggregation of its outputs finally constitutes the robust prediction of interest. Experimental evaluation by a well-established benchmark of strong adaptive attacks, across different image datasets, shows that CARSO is able to defend itself against adaptive end-to-end white-box attacks devised for stochastic defences. Paying a modest clean accuracy toll, our method improves by a significant margin the state-of-the-art for Cifar-10, Cifar-100, and TinyImageNet-200 $\ell_\infty$ robust classification accuracy against AutoAttack. Code, and instructions to obtain pre-trained models are available at: https://github.com/emaballarin/CARSO .
LGNov 20, 2024
Effective Analog ICs Floorplanning with Relational Graph Neural Networks and Reinforcement LearningDavide Basso, Luca Bortolussi, Mirjana Videnovic-Misic et al.
Analog integrated circuit (IC) floorplanning is typically a manual process with the placement of components (devices and modules) planned by a layout engineer. This process is further complicated by the interdependence of floorplanning and routing steps, numerous electric and layout-dependent constraints, as well as the high level of customization expected in analog design. This paper presents a novel automatic floorplanning algorithm based on reinforcement learning. It is augmented by a relational graph convolutional neural network model for encoding circuit features and positional constraints. The combination of these two machine learning methods enables knowledge transfer across different circuit designs with distinct topologies and constraints, increasing the \emph{generalization ability} of the solution. Applied to $6$ industrial circuits, our approach surpassed established floorplanning techniques in terms of speed, area and half-perimeter wire length. When integrated into a \emph{procedural generator} for layout completion, overall layout time was reduced by $67.3\%$ with a $8.3\%$ mean area reduction compared to manual layout.
AIDec 4, 2023
Learning-Based Approaches to Predictive Monitoring with Conformal Statistical GuaranteesFrancesca Cairoli, Luca Bortolussi, Nicola Paoletti
This tutorial focuses on efficient methods to predictive monitoring (PM), the problem of detecting at runtime future violations of a given requirement from the current state of a system. While performing model checking at runtime would offer a precise solution to the PM problem, it is generally computationally expensive. To address this scalability issue, several lightweight approaches based on machine learning have recently been proposed. These approaches work by learning an approximate yet efficient surrogate (deep learning) model of the expensive model checker. A key challenge remains to ensure reliable predictions, especially in safety-critical applications. We review our recent work on predictive monitoring, one of the first to propose learning-based approximations for CPS verification of temporal logic specifications and the first in this context to apply conformal prediction (CP) for rigorous uncertainty quantification. These CP-based uncertainty estimators offer statistical guarantees regarding the generalization error of the learning model, and they can be used to determine unreliable predictions that should be rejected. In this tutorial, we present a general and comprehensive framework summarizing our approach to the predictive monitoring of CPSs, examining in detail several variants determined by three main dimensions: system dynamics (deterministic, non-deterministic, stochastic), state observability, and semantics of requirements' satisfaction (Boolean or quantitative).
LGMay 23, 2024
Retrieval-Augmented Mining of Temporal Logic Specifications from DataGaia Saveri, Luca Bortolussi
The integration of cyber-physical systems (CPS) into everyday life raises the critical necessity of ensuring their safety and reliability. An important step in this direction is requirement mining, i.e. inferring formally specified system properties from observed behaviors, in order to discover knowledge about the system. Signal Temporal Logic (STL) offers a concise yet expressive language for specifying requirements, particularly suited for CPS, where behaviors are typically represented as time series data. This work addresses the task of learning STL requirements from observed behaviors in a data-driven manner, focusing on binary classification, i.e. on inferring properties of the system which are able to discriminate between regular and anomalous behaviour, and that can be used both as classifiers and as monitors of the compliance of the CPS to desirable specifications. We present a novel framework that combines Bayesian Optimization (BO) and Information Retrieval (IR) techniques to simultaneously learn both the structure and the parameters of STL formulae, without restrictions on the STL grammar. Specifically, we propose a framework that leverages a dense vector database containing semantic-preserving continuous representations of millions of formulae, queried for facilitating the mining of requirements inside a BO loop. We demonstrate the effectiveness of our approach in several signal classification applications, showing its ability to extract interpretable insights from system executions and advance the state-of-the-art in requirement mining for CPS.
LGDec 13, 2024
Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and AdaptationFederico Julian Camerota Verdù, Lorenzo Castelli, Luca Bortolussi
We introduce Limited Rollout Beam Search (LRBS), a beam search strategy for deep reinforcement learning (DRL) based combinatorial optimization improvement heuristics. Utilizing pre-trained models on the Euclidean Traveling Salesperson Problem, LRBS significantly enhances both in-distribution performance and generalization to larger problem instances, achieving optimality gaps that outperform existing improvement heuristics and narrowing the gap with state-of-the-art constructive methods. We also extend our analysis to two pickup and delivery TSP variants to validate our results. Finally, we employ our search strategy for offline and online adaptation of the pre-trained improvement policy, leading to improved search performance and surpassing recent adaptive methods for constructive heuristics.
CVOct 31, 2024
ResiDual Transformer Alignment with Spectral DecompositionLorenzo Basile, Valentino Maiorca, Luca Bortolussi et al.
When examined through the lens of their residual streams, a puzzling property emerges in transformer networks: residual contributions (e.g., attention heads) sometimes specialize in specific tasks or input attributes. In this paper, we analyze this phenomenon in vision transformers, focusing on the spectral geometry of residuals, and explore its implications for modality alignment in vision-language models. First, we link it to the intrinsically low-dimensional structure of visual head representations, zooming into their principal components and showing that they encode specialized roles across a wide variety of input data distributions. Then, we analyze the effect of head specialization in multimodal models, focusing on how improved alignment between text and specialized heads impacts zero-shot classification performance. This specialization-performance link consistently holds across diverse pre-training data, network sizes, and objectives, demonstrating a powerful new mechanism for boosting zero-shot classification through targeted alignment. Ultimately, we translate these insights into actionable terms by introducing ResiDual, a technique for spectral alignment of the residual stream. Much like panning for gold, it lets the noise from irrelevant unit principal components (i.e., attributes) wash away to amplify task-relevant ones. Remarkably, this dual perspective on modality alignment yields fine-tuning level performance on different data distributions while modelling an extremely interpretable and parameter-efficient transformation, as we extensively show on 70 pre-trained network-dataset combinations (7 models, 10 datasets).
AIMay 23, 2024
stl2vec: Semantic and Interpretable Vector Representation of Temporal LogicGaia Saveri, Laura Nenzi, Luca Bortolussi et al.
Integrating symbolic knowledge and data-driven learning algorithms is a longstanding challenge in Artificial Intelligence. Despite the recognized importance of this task, a notable gap exists due to the discreteness of symbolic representations and the continuous nature of machine-learning computations. One of the desired bridges between these two worlds would be to define semantically grounded vector representation (feature embedding) of logic formulae, thus enabling to perform continuous learning and optimization in the semantic space of formulae. We tackle this goal for knowledge expressed in Signal Temporal Logic (STL) and devise a method to compute continuous embeddings of formulae with several desirable properties: the embedding (i) is finite-dimensional, (ii) faithfully reflects the semantics of the formulae, (iii) does not require any learning but instead is defined from basic principles, (iv) is interpretable. Another significant contribution lies in demonstrating the efficacy of the approach in two tasks: learning model checking, where we predict the probability of requirements being satisfied in stochastic processes; and integrating the embeddings into a neuro-symbolic framework, to constrain the output of a deep-learning generative model to comply to a given logical specification.
LGJun 10, 2025
Diffusion-based Time Series Forecasting for Sewerage SystemsNicholas A. Pearson, Francesca Cairoli, Luca Bortolussi et al.
We introduce a novel deep learning approach that harnesses the power of generative artificial intelligence to enhance the accuracy of contextual forecasting in sewerage systems. By developing a diffusion-based model that processes multivariate time series data, our system excels at capturing complex correlations across diverse environmental signals, enabling robust predictions even during extreme weather events. To strengthen the model's reliability, we further calibrate its predictions with a conformal inference technique, tailored for probabilistic time series data, ensuring that the resulting prediction intervals are statistically reliable and cover the true target values with a desired confidence level. Our empirical tests on real sewerage system data confirm the model's exceptional capability to deliver reliable contextual predictions, maintaining accuracy even under severe weather conditions.
LGMay 21, 2025
Graph-Conditional Flow Matching for Relational Data GenerationDavide Scassola, Sebastiano Saccani, Luca Bortolussi
Data synthesis is gaining momentum as a privacy-enhancing technology. While single-table tabular data generation has seen considerable progress, current methods for multi-table data often lack the flexibility and expressiveness needed to capture complex relational structures. In particular, they struggle with long-range dependencies and complex foreign-key relationships, such as tables with multiple parent tables or multiple types of links between the same pair of tables. We propose a generative model for relational data that generates the content of a relational dataset given the graph formed by the foreign-key relationships. We do this by learning a deep generative model of the content of the whole relational database by flow matching, where the neural network trained to denoise records leverages a graph neural network to obtain information from connected records. Our method is flexible, as it can support relational datasets with complex structures, and expressive, as the generation of each record can be influenced by any other record within the same connected component. We evaluate our method on several benchmark datasets and show that it achieves state-of-the-art performance in terms of synthetic data fidelity.
AIMay 8, 2025
Enhancing Reinforcement Learning for the Floorplanning of Analog ICs with Beam SearchSandro Junior Della Rovere, Davide Basso, Luca Bortolussi et al.
The layout of analog ICs requires making complex trade-offs, while addressing device physics and variability of the circuits. This makes full automation with learning-based solutions hard to achieve. However, reinforcement learning (RL) has recently reached significant results, particularly in solving the floorplanning problem. This paper presents a hybrid method that combines RL with a beam (BS) strategy. The BS algorithm enhances the agent's inference process, allowing for the generation of flexible floorplans by accomodating various objective weightings, and addressing congestion without without the need for policy retraining or fine-tuning. Moreover, the RL agent's generalization ability stays intact, along with its efficient handling of circuit features and constraints. Experimental results show approx. 5-85% improvement in area, dead space and half-perimeter wire length compared to a standard RL application, along with higher rewards for the agent. Moreover, performance and efficiency align closely with those of existing state-of-the-art techniques.
LGFeb 19, 2025
Contrastive Learning-Based privacy metrics in Tabular Synthetic DatasetsMilton Nicolás Plasencia Palacios, Sebastiano Saccani, Gabriele Sgroi et al.
Synthetic data has garnered attention as a Privacy Enhancing Technology (PET) in sectors such as healthcare and finance. When using synthetic data in practical applications, it is important to provide protection guarantees. In the literature, two family of approaches are proposed for tabular data: on the one hand, Similarity-based methods aim at finding the level of similarity between training and synthetic data. Indeed, a privacy breach can occur if the generated data is consistently too similar or even identical to the train data. On the other hand, Attack-based methods conduce deliberate attacks on synthetic datasets. The success rates of these attacks reveal how secure the synthetic datasets are. In this paper, we introduce a contrastive method that improves privacy assessment of synthetic datasets by embedding the data in a more representative space. This overcomes obstacles surrounding the multitude of data types and attributes. It also makes the use of intuitive distance metrics possible for similarity measurements and as an attack vector. In a series of experiments with publicly available datasets, we compare the performances of similarity-based and attack-based methods, both with and without use of the contrastive learning-based embeddings. Our results show that relatively efficient, easy to implement privacy metrics can perform equally well as more advanced metrics explicitly modeling conditions for privacy referred to by the GDPR.
LGMay 17, 2024
ECATS: Explainable-by-design concept-based anomaly detection for time seriesIrene Ferfoglia, Gaia Saveri, Laura Nenzi et al.
Deep learning methods for time series have already reached excellent performances in both prediction and classification tasks, including anomaly detection. However, the complexity inherent in Cyber Physical Systems (CPS) creates a challenge when it comes to explainability methods. To overcome this inherent lack of interpretability, we propose ECATS, a concept-based neuro-symbolic architecture where concepts are represented as Signal Temporal Logic (STL) formulae. Leveraging kernel-based methods for STL, concept embeddings are learnt in an unsupervised manner through a cross-attention mechanism. The network makes class predictions through these concept embeddings, allowing for a meaningful explanation to be naturally extracted for each input. Our preliminary experiments with a simple CPS-based dataset show that our model is able to achieve great classification performance while ensuring local interpretability.
CLMar 5
Distilling Formal Logic into Neural Spaces: A Kernel Alignment Approach for Signal Temporal LogicSara Candussio, Gabriele Sarti, Gaia Saveri et al.
We introduce a framework for learning continuous neural representations of formal specifications by distilling the geometry of their semantics into a latent space. Existing approaches rely either on symbolic kernels -- which preserve behavioural semantics but are computationally prohibitive, anchor-dependent, and non-invertible -- or on syntax-based neural embeddings that fail to capture underlying structures. Our method bridges this gap: using a teacher-student setup, we distill a symbolic robustness kernel into a Transformer encoder. Unlike standard contrastive methods, we supervise the model with a continuous, kernel-weighted geometric alignment objective that penalizes errors in proportion to their semantic discrepancies. Once trained, the encoder produces embeddings in a single forward pass, effectively mimicking the kernel's logic at a fraction of its computational cost. We apply our framework to Signal Temporal Logic (STL), demonstrating that the resulting neural representations faithfully preserve the semantic similarity of STL formulae, accurately predict robustness and constraint satisfaction, and remain intrinsically invertible. Our proposed approach enables highly efficient, scalable neuro-symbolic reasoning and formula reconstruction without repeated kernel computation at runtime.
AIOct 17, 2025
Advancing Routing-Awareness in Analog ICs FloorplanningDavide Basso, Luca Bortolussi, Mirjana Videnovic-Misic et al.
The adoption of machine learning-based techniques for analog integrated circuit layout, unlike its digital counterpart, has been limited by the stringent requirements imposed by electric and problem-specific constraints, along with the interdependence of floorplanning and routing steps. In this work, we address a prevalent concern among layout engineers regarding the need for readily available routing-aware floorplanning solutions. To this extent, we develop an automatic floorplanning engine based on reinforcement learning and relational graph convolutional neural network specifically tailored to condition the floorplan generation towards more routable outcomes. A combination of increased grid resolution and precise pin information integration, along with a dynamic routing resource estimation technique, allows balancing routing and area efficiency, eventually meeting industrial standards. When analyzing the place and route effectiveness in a simulated environment, the proposed approach achieves a 13.8% reduction in dead space, a 40.6% reduction in wirelength and a 73.4% increase in routing success when compared to past learning-based state-of-the-art techniques.
LGOct 13, 2025
Beyond single-model XAI: aggregating multi-model explanations for enhanced trustworthinessIlaria Vascotto, Alex Rodriguez, Alessandro Bonaita et al.
The use of Artificial Intelligence (AI) models in real-world and high-risk applications has intensified the discussion about their trustworthiness and ethical usage, from both a technical and a legislative perspective. The field of eXplainable Artificial Intelligence (XAI) addresses this challenge by proposing explanations that bring to light the decision-making processes of complex black-box models. Despite being an essential property, the robustness of explanations is often an overlooked aspect during development: only robust explanation methods can increase the trust in the system as a whole. This paper investigates the role of robustness through the usage of a feature importance aggregation derived from multiple models ($k$-nearest neighbours, random forest and neural networks). Preliminary results showcase the potential in increasing the trustworthiness of the application, while leveraging multiple model's predictive power.
AISep 1, 2025
Conformal Predictive Monitoring for Multi-Modal ScenariosFrancesca Cairoli, Luca Bortolussi, Jyotirmoy V. Deshmukh et al.
We consider the problem of quantitative predictive monitoring (QPM) of stochastic systems, i.e., predicting at runtime the degree of satisfaction of a desired temporal logic property from the current state of the system. Since computational efficiency is key to enable timely intervention against predicted violations, several state-of-the-art QPM approaches rely on fast machine-learning surrogates to provide prediction intervals for the satisfaction values, using conformal inference to offer statistical guarantees. However, these QPM methods suffer when the monitored agent exhibits multi-modal dynamics, whereby certain modes may yield high satisfaction values while others critically violate the property. Existing QPM methods are mode-agnostic and so would yield overly conservative and uninformative intervals that lack meaningful mode-specific satisfaction information. To address this problem, we present GenQPM, a method that leverages deep generative models, specifically score-based diffusion models, to reliably approximate the probabilistic and multi-modal system dynamics without requiring explicit model access. GenQPM employs a mode classifier to partition the predicted trajectories by dynamical mode. For each mode, we then apply conformal inference to produce statistically valid, mode-specific prediction intervals. We demonstrate the effectiveness of GenQPM on a benchmark of agent navigation and autonomous driving tasks, resulting in prediction intervals that are significantly more informative (less conservative) than mode-agnostic baselines.
LGAug 5, 2025
Towards Interpretable Concept Learning over Time Series via Temporal Logic SemanticsIrene Ferfoglia, Simone Silvetti, Gaia Saveri et al.
Time series classification is a task of paramount importance, as this kind of data often arises in safety-critical applications. However, it is typically tackled with black-box deep learning methods, making it hard for humans to understand the rationale behind their output. To take on this challenge, we propose a neuro-symbolic framework that unifies classification and explanation through direct embedding of trajectories into a space of Signal Temporal Logic (STL) concepts. By introducing a novel STL-inspired kernel that maps raw time series to their alignment with predefined STL formulae, our model jointly optimises for accuracy and interpretability, as each prediction is accompanied by the most relevant logical concepts that characterise it. This enables classification grounded in human-interpretable temporal patterns and produces both local and global symbolic explanations. Early results show competitive performance while offering high-quality logical justifications for model decisions.
LGJul 23, 2025
CoCAI: Copula-based Conformal Anomaly Identification for Multivariate Time-SeriesNicholas A. Pearson, Francesca Zanello, Davide Russo et al.
We propose a novel framework that harnesses the power of generative artificial intelligence and copula-based modeling to address two critical challenges in multivariate time-series analysis: delivering accurate predictions and enabling robust anomaly detection. Our method, Copula-based Conformal Anomaly Identification for Multivariate Time-Series (CoCAI), leverages a diffusion-based model to capture complex dependencies within the data, enabling high quality forecasting. The model's outputs are further calibrated using a conformal prediction technique, yielding predictive regions which are statistically valid, i.e., cover the true target values with a desired confidence level. Starting from these calibrated forecasts, robust outlier detection is performed by combining dimensionality reduction techniques with copula-based modeling, providing a statistically grounded anomaly score. CoCAI benefits from an offline calibration phase that allows for minimal overhead during deployment and delivers actionable results rooted in established theoretical foundations. Empirical tests conducted on real operational data derived from water distribution and sewerage systems confirm CoCAI's effectiveness in accurately forecasting target sequences of data and in identifying anomalous segments within them.
LGJul 13, 2025
Assessing reliability of explanations in unbalanced datasets: a use-case on the occurrence of frost eventsIlaria Vascotto, Valentina Blasone, Alex Rodriguez et al.
The usage of eXplainable Artificial Intelligence (XAI) methods has become essential in practical applications, given the increasing deployment of Artificial Intelligence (AI) models and the legislative requirements put forward in the latest years. A fundamental but often underestimated aspect of the explanations is their robustness, a key property that should be satisfied in order to trust the explanations. In this study, we provide some preliminary insights on evaluating the reliability of explanations in the specific case of unbalanced datasets, which are very frequent in high-risk use-cases, but at the same time considerably challenging for both AI models and XAI methods. We propose a simple evaluation focused on the minority class (i.e. the less frequent one) that leverages on-manifold generation of neighbours, explanation aggregation and a metric to test explanation consistency. We present a use-case based on a tabular dataset with numerical features focusing on the occurrence of frost events.
CLJul 10, 2025
Bridging Logic and Learning: Decoding Temporal Logic Embeddings via TransformersSara Candussio, Gaia Saveri, Gabriele Sarti et al.
Continuous representations of logic formulae allow us to integrate symbolic knowledge into data-driven learning algorithms. If such embeddings are semantically consistent, i.e. if similar specifications are mapped into nearby vectors, they enable continuous learning and optimization directly in the semantic space of formulae. However, to translate the optimal continuous representation into a concrete requirement, such embeddings must be invertible. We tackle this issue by training a Transformer-based decoder-only model to invert semantic embeddings of Signal Temporal Logic (STL) formulae. STL is a powerful formalism that allows us to describe properties of signals varying over time in an expressive yet concise way. By constructing a small vocabulary from STL syntax, we demonstrate that our proposed model is able to generate valid formulae after only 1 epoch and to generalize to the semantics of the logic in about 10 epochs. Additionally, the model is able to decode a given embedding into formulae that are often simpler in terms of length and nesting while remaining semantically close (or equivalent) to gold references. We show the effectiveness of our methodology across various levels of training formulae complexity to assess the impact of training data on the model's ability to effectively capture the semantic information contained in the embeddings and generalize out-of-distribution. Finally, we deploy our model for solving a requirement mining task, i.e. inferring STL specifications that solve a classification task on trajectories, performing the optimization directly in the semantic space.
LGOct 12, 2024
Timeseria: an object-oriented time series processing libraryStefano Alberto Russo, Giuliano Taffoni, Luca Bortolussi
Timeseria is an object-oriented time series processing library implemented in Python, which aims at making it easier to manipulate time series data and to build statistical and machine learning models on top of it. Unlike common data analysis frameworks, it builds up from well defined and reusable logical units (objects), which can be easily combined together in order to ensure a high level of consistency. Thanks to this approach, Timeseria can address by design several non-trivial issues which are often underestimated, such as handling data losses, non-uniform sampling rates, differences between aggregated data and punctual observations, time zones, daylight saving times, and more. Timeseria comes with a comprehensive set of base data structures, data transformations for resampling and aggregation, common data manipulation operations, and extensible models for data reconstruction, forecasting and anomaly detection. It also integrates a fully featured, interactive plotting engine capable of handling even millions of data points.
LGJun 22, 2024
Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representationsLorenzo Basile, Santiago Acevedo, Luca Bortolussi et al.
To gain insight into the mechanisms behind machine learning methods, it is crucial to establish connections among the features describing data points. However, these correlations often exhibit a high-dimensional and strongly nonlinear nature, which makes them challenging to detect using standard methods. This paper exploits the entanglement between intrinsic dimensionality and correlation to propose a metric that quantifies the (potentially nonlinear) correlation between high-dimensional manifolds. We first validate our method on synthetic data in controlled environments, showcasing its advantages and drawbacks compared to existing techniques. Subsequently, we extend our analysis to large-scale applications in neural network representations. Specifically, we focus on latent representations of multimodal data, uncovering clear correlations between paired visual and textual embeddings, whereas existing methods struggle significantly in detecting similarity. Our results indicate the presence of highly nonlinear correlation patterns between latent manifolds.
LGJun 20, 2024
When Can You Trust Your Explanations? A Robustness Analysis on Feature ImportancesIlaria Vascotto, Alex Rodriguez, Alessandro Bonaita et al.
Recent legislative regulations have underlined the need for accountable and transparent artificial intelligence systems and have contributed to a growing interest in the Explainable Artificial Intelligence (XAI) field. Nonetheless, the lack of standardized criteria to validate explanation methodologies remains a major obstacle to developing trustworthy systems. We address a crucial yet often overlooked aspect of XAI, the robustness of explanations, which plays a central role in ensuring trust in both the system and the provided explanation. To this end, we propose a novel approach to analyse the robustness of neural network explanations to non-adversarial perturbations, leveraging the manifold hypothesis to produce new perturbed datapoints that resemble the observed data distribution. We additionally present an ensemble method to aggregate various explanations, showing how merging explanations can be beneficial for both understanding the model's decision and evaluating the robustness. The aim of our work is to provide practitioners with a framework for evaluating the trustworthiness of model explanations. Experimental results on feature importances derived from neural networks applied to tabular datasets highlight the importance of robust explanations in practical applications.
LGMay 24, 2023
Frequency maps reveal the correlation between Adversarial Attacks and Implicit BiasLorenzo Basile, Nikos Karantzas, Alberto d'Onofrio et al.
Despite their impressive performance in classification tasks, neural networks are known to be vulnerable to adversarial attacks, subtle perturbations of the input data designed to deceive the model. In this work, we investigate the correlation between these perturbations and the implicit bias of neural networks trained with gradient-based algorithms. To this end, we analyse a representation of the network's implicit bias through the lens of the Fourier transform. Specifically, we identify unique fingerprints of implicit bias and adversarial attacks by calculating the minimal, essential frequencies needed for accurate classification of each image, as well as the frequencies that drive misclassification in its adversarially perturbed counterpart. This approach enables us to uncover and analyse the correlation between these essential frequencies, providing a precise map of how the network's biases align or contrast with the frequency components exploited by adversarial attacks. To this end, among other methods, we use a newly introduced technique capable of detecting nonlinear correlations between high-dimensional datasets. Our results provide empirical evidence that the network bias in Fourier space and the target frequencies of adversarial attacks are highly correlated and suggest new potential strategies for adversarial defence.
AIMay 3, 2023
Towards Invertible Semantic-Preserving Embeddings of Logical FormulaeGaia Saveri, Luca Bortolussi
Logic is the main formal language to perform automated reasoning, and it is further a human-interpretable language, at least for small formulae. Learning and optimising logic requirements and rules has always been an important problem in Artificial Intelligence. State of the art Machine Learning (ML) approaches are mostly based on gradient descent optimisation in continuous spaces, while learning logic is framed in the discrete syntactic space of formulae. Using continuous optimisation to learn logic properties is a challenging problem, requiring to embed formulae in a continuous space in a meaningful way, i.e. preserving the semantics. Current methods are able to construct effective semantic-preserving embeddings via kernel methods (for linear temporal logic), but the map they define is not invertible. In this work we address this problem, learning how to invert such an embedding leveraging deep architectures based on the Graph Variational Autoencoder framework. We propose a novel model specifically designed for this setting, justifying our design choices through an extensive experimental evaluation. Reported results in the context of propositional logic are promising, and several challenges regarding learning invertible embeddings of formulae are highlighted and addressed.
LOJan 24, 2022
Learning Model Checking and the Kernel Trick for Signal Temporal Logic on Stochastic ProcessesLuca Bortolussi, Giuseppe Maria Gallo, Jan Křetínský et al.
We introduce a similarity function on formulae of signal temporal logic (STL). It comes in the form of a kernel function, well known in machine learning as a conceptually and computationally efficient tool. The corresponding kernel trick allows us to circumvent the complicated process of feature extraction, i.e. the (typically manual) effort to identify the decisive properties of formulae so that learning can be applied. We demonstrate this consequence and its advantages on the task of predicting (quantitative) satisfaction of STL formulae on stochastic processes: Using our kernel and the kernel trick, we learn (i) computationally efficiently (ii) a practically precise predictor of satisfaction, (iii) avoiding the difficult task of finding a way to explicitly turn formulae into vectors of numbers in a sensible way. We back the high precision we have achieved in the experiments by a theoretically sound PAC guarantee, ensuring our procedure efficiently delivers a close-to-optimal predictor.
LGAug 16, 2021
Neural Predictive Monitoring under Partial ObservabilityFrancesca Cairoli, Luca Bortolussi, Nicola Paoletti
We consider the problem of predictive monitoring (PM), i.e., predicting at runtime future violations of a system from the current state. We work under the most realistic settings where only partial and noisy observations of the state are available at runtime. Such settings directly affect the accuracy and reliability of the reachability predictions, jeopardizing the safety of the system. In this work, we present a learning-based method for PM that produces accurate and reliable reachability predictions despite partial observability (PO). We build on Neural Predictive Monitoring (NPM), a PM method that uses deep neural networks for approximating hybrid systems reachability, and extend it to the PO case. We propose and compare two solutions, an end-to-end approach, which directly operates on the rough observations, and a two-step approach, which introduces an intermediate state estimation step. Both solutions rely on conformal prediction to provide 1) probabilistic guarantees in the form of prediction regions and 2) sound estimates of predictive uncertainty. We use the latter to identify unreliable (and likely erroneous) predictions and to retrain and improve the monitors on these uncertain inputs (i.e., active learning). Our method results in highly accurate reachability predictions and error detection, as well as tight prediction regions with guaranteed coverage.
LGJun 24, 2021
Abstraction of Markov Population Dynamics via Generative Adversarial NetsFrancesca Cairoli, Ginevra Carbone, Luca Bortolussi
Markov Population Models are a widespread formalism used to model the dynamics of complex systems, with applications in Systems Biology and many other fields. The associated Markov stochastic process in continuous time is often analyzed by simulation, which can be costly for large or stiff systems, particularly when a massive number of simulations has to be performed (e.g. in a multi-scale model). A strategy to reduce computational load is to abstract the population model, replacing it with a simpler stochastic model, faster to simulate. Here we pursue this idea, building on previous works and constructing a generator capable of producing stochastic trajectories in continuous space and discrete time. This generator is learned automatically from simulations of the original model in a Generative Adversarial setting. Compared to previous works, which rely on deep neural networks and Dirichlet processes, we explore the use of state of the art generative models, which are flexible enough to learn a full trajectory rather than a single transition kernel.
MLMay 3, 2021
Abstraction-Guided Truncations for Stationary Distributions of Markov Population ModelsMichael Backenköhler, Luca Bortolussi, Gerrit Großmann et al.
To understand the long-run behavior of Markov population models, the computation of the stationary distribution is often a crucial part. We propose a truncation-based approximation that employs a state-space lumping scheme, aggregating states in a grid structure. The resulting approximate stationary distribution is used to iteratively refine relevant and truncate irrelevant parts of the state-space. This way, the algorithm learns a well-justified finite-state projection tailored to the stationary behavior. We demonstrate the method's applicability to a wide range of non-linear problems with complex stationary behaviors.
LGFeb 22, 2021
Resilience of Bayesian Layer-Wise Explanations under Adversarial AttacksGinevra Carbone, Guido Sanguinetti, Luca Bortolussi
We consider the problem of the stability of saliency-based explanations of Neural Network predictions under adversarial attacks in a classification task. Saliency interpretations of deterministic Neural Networks are remarkably brittle even when the attacks fail, i.e. for attacks that do not change the classification label. We empirically show that interpretations provided by Bayesian Neural Networks are considerably more stable under adversarial perturbations of the inputs and even under direct attacks to the explanations. By leveraging recent results, we also provide a theoretical explanation of this result in terms of the geometry of the data manifold. Additionally, we discuss the stability of the interpretations of high level representations of the inputs in the internal layers of a Network. Our results demonstrate that Bayesian methods, in addition to being more robust to adversarial attacks, have the potential to provide more stable and interpretable assessments of Neural Network predictions.
LGFeb 18, 2021
Random Projections for Improved Adversarial RobustnessGinevra Carbone, Guido Sanguinetti, Luca Bortolussi
We propose two training techniques for improving the robustness of Neural Networks to adversarial attacks, i.e. manipulations of the inputs that are maliciously crafted to fool networks into incorrect predictions. Both methods are independent of the chosen attack and leverage random projections of the original inputs, with the purpose of exploiting both dimensionality reduction and some characteristic geometrical properties of adversarial perturbations. The first technique is called RP-Ensemble and consists of an ensemble of networks trained on multiple projected versions of the original inputs. The second one, named RP-Regularizer, adds instead a regularization term to the training objective.
LGSep 11, 2020
A kernel function for Signal Temporal Logic formulaeLuca Bortolussi, Giuseppe Maria Gallo, Laura Nenzi
We discuss how to define a kernel for Signal Temporal Logic (STL) formulae. Such a kernel allows us to embed the space of formulae into a Hilbert space, and opens up the use of kernel-based machine learning algorithms in the context of STL. We show an application of this idea to a regression problem in formula space for probabilistic models.
LGJun 23, 2020
Density-embedding layers: a general framework for adaptive receptive fieldsFrancesco Cicala, Luca Bortolussi
The effectiveness and performance of artificial neural networks, particularly for visual tasks, depends in crucial ways on the receptive field of neurons. The receptive field itself depends on the interplay between several architectural aspects, including sparsity, pooling, and activation functions. In recent literature there are several ad hoc proposals trying to make receptive fields more flexible and adaptive to data. For instance, different parameterizations of convolutional and pooling layers have been proposed to increase their adaptivity. In this paper, we propose the novel theoretical framework of density-embedded layers, generalizing the transformation represented by a neuron. Specifically, the affine transformation applied on the input is replaced by a scalar product of the input, suitably represented as a piecewise constant function, with a density function associated with the neuron. This density is shown to describe directly the receptive field of the neuron. Crucially, by suitably representing such a density as a linear combination of a parametric family of functions, we can efficiently train the densities by means of any automatic differentiation system, making it adaptable to the problem at hand, and computationally efficient to evaluate. This framework captures and generalizes recent methods, allowing a fine tuning of the receptive field. In the paper, we define some novel layers and we experimentally validate them on the classic MNIST dataset.
LGFeb 11, 2020
Robustness of Bayesian Neural Networks to Gradient-Based AttacksGinevra Carbone, Matthew Wicker, Luca Laurenti et al.
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, the problem remains open. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparametrized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in the limit BNN posteriors are robust to gradient-based adversarial attacks. Experimental results on the MNIST and Fashion MNIST datasets with BNNs trained with Hamiltonian Monte Carlo and Variational Inference support this line of argument, showing that BNNs can display both high accuracy and robustness to gradient based adversarial attacks.
LGNov 8, 2018
Intrinsic Geometric Vulnerability of High-Dimensional Artificial IntelligenceLuca Bortolussi, Guido Sanguinetti
The success of modern Artificial Intelligence (AI) technologies depends critically on the ability to learn non-linear functional dependencies from large, high dimensional data sets. Despite recent high-profile successes, empirical evidence indicates that the high predictive performance is often paired with low robustness, making AI systems potentially vulnerable to adversarial attacks. In this report, we provide a simple intuitive argument suggesting that high performance and vulnerability are intrinsically coupled, and largely dependent on the geometry of typical, high-dimensional data sets. Our work highlights a major potential pitfall of modern AI systems, and suggests practical research directions to ameliorate the problem.
AINov 13, 2017
A Robust Genetic Algorithm for Learning Temporal Specifications from DataLaura Nenzi, Simone Silvetti, Ezio Bartocci et al.
We consider the problem of mining signal temporal logical requirements from a dataset of regular (good) and anomalous (bad) trajectories of a dynamical system. We assume the training set to be labeled by human experts and that we have access only to a limited amount of data, typically noisy. We provide a systematic approach to synthesize both the syntactical structure and the parameters of the temporal logic formula using a two-steps procedure: first, we leverage a novel evolutionary algorithm for learning the structure of the formula; second, we perform the parameter synthesis operating on the statistical emulation of the average robustness for a candidate formula w.r.t. its parameters. We compare our results with our previous work [{BufoBSBLB14] and with a recently proposed decision-tree [bombara_decision_2016] based method. We present experimental results on two case studies: an anomalous trajectory detection problem of a naval surveillance system and the characterization of an Ineffective Respiratory effort, showing the usefulness of our work.
MLMay 7, 2016
Matching models across abstraction levels with Gaussian ProcessesGiulio Caravagna, Luca Bortolussi, Guido Sanguinetti
Biological systems are often modelled at different levels of abstraction depending on the particular aims/resources of a study. Such different models often provide qualitatively concordant predictions over specific parametrisations, but it is generally unclear whether model predictions are quantitatively in agreement, and whether such agreement holds for different parametrisations. Here we present a generally applicable statistical machine learning methodology to automatically reconcile the predictions of different models across abstraction levels. Our approach is based on defining a correction map, a random function which modifies the output of a model in order to match the statistics of the output of a different model of the same system. We use two biological examples to give a proof-of-principle demonstration of the methodology, and discuss its advantages and potential further applications.