NA Dec 8, 2011
On the Reaction Diffusion Master Equation in the Microscopic Limit
Stefan Hellander, Andreas Hellander, Linda Petzold
Stochastic modeling of reaction-diffusion kinetics has emerged as a powerful theoretical tool in the study of biochemical reaction networks. Two frequently employed models are the particle-tracking Smoluchowski framework and the on-lattice Reaction-Diffusion Master Equation (RDME) framework. As the mesh size goes from coarse to fine, the RDME initially becomes more accurate. However, recent developments have shown that it will become increasingly inaccurate compared to the Smoluchowski model as the lattice spacing becomes very fine. In this paper we give a new, general and simple argument for why the RDME breaks down. Our analysis reveals a hard limit on the voxel size for which no local RDME can agree with the Smoluchowski model.
NA Jan 28, 2015
Reaction rates for mesoscopic reaction-diffusion kinetics
Stefan Hellander, Andreas Hellander, Linda Petzold
The mesoscopic reaction-diffusion master equation (RDME) is a popular modeling framework, frequently applied to stochastic reaction-diffusion kinetics in systems biology. The RDME is derived from assumptions about the underlying physical properties of the system, and it may produce unphysical results for models where those assumptions fail. In that case, other more comprehensive models are better suited, such as hard-sphere Brownian dynamics (BD). Although the RDME is a model in its own right, and not inferred from any specific microscale model, it proves useful to attempt to approximate a microscale model by a specific choice of mesoscopic reaction rates. In this paper we derive mesoscopic reaction rates by matching certain statistics of the RDME solution to statistics of the solution of a widely used microscopic BD model: the Smoluchowski model with a mixed boundary condition at the reaction radius of two molecules. We also establish fundamental limits for the range of mesh resolutions for which this approach yields accurate results, and show both theoretically and in numerical examples that as we approach the lower fundamental limit, the mesoscopic dynamics approach the microscopic dynamics.
NA Apr 21, 2008
Simulation of stochastic reaction-diffusion processes on unstructured meshes
Stefan Engblom, Lars Ferm, Andreas Hellander et al.
Stochastic chemical systems with diffusion are modeled with a reaction-diffusion master equation. On a macroscopic level, the governing equation is a reaction-diffusion equation for the averages of the chemical species. On a mesoscopic level, the master equation for a well stirred chemical system is combined with Brownian motion in space to obtain the reaction-diffusion master equation. The space is covered by an unstructured mesh and the diffusion coefficients on the mesoscale are obtained from a finite element discretization of the Laplace operator on the macroscale. The resulting method is a flexible hybrid algorithm in that the diffusion can be handled either on the meso- or on the macroscale level. The accuracy and the efficiency of the method are illustrated in three numerical examples inspired by molecular biology.
LG Jan 23, 2023
Accelerating Fair Federated Learning: Adaptive Federated Adam
Li Ju, Tianru Zhang, Salman Toor et al.
Federated learning is a distributed and privacy-preserving approach to train a statistical model collaboratively from decentralized data of different parties. However, when datasets of participants are not independent and identically distributed (non-IID), models trained by naive federated algorithms may be biased towards certain participants, and model performance across participants is non-uniform. This is known as the fairness problem in federated learning. In this paper, we formulate fairness-controlled federated learning as a dynamical multi-objective optimization problem to ensure fair performance across all participants. To solve the problem efficiently, we study the convergence and bias of Adam as the server optimizer in federated learning, and propose Adaptive Federated Adam (AdaFedAdam) to accelerate fair federated learning with alleviated bias. We validate the effectiveness, Pareto optimality and robustness of AdaFedAdam in numerical experiments and show that AdaFedAdam outperforms existing algorithms, providing better convergence and fairness properties of the federated scheme.
NA Dec 23, 2015
Analysis and design of jump coefficients in discrete stochastic diffusion models
Lina Meinecke, Stefan Engblom, Andreas Hellander et al.
In computational systems biology, the mesoscopic model of reaction-diffusion kinetics is described by a continuous time, discrete space Markov process. To simulate diffusion stochastically, the jump coefficients are obtained by a discretization of the diffusion equation. Using unstructured meshes to represent complicated geometries may lead to negative coefficients when using piecewise linear finite elements. Several methods have been proposed to modify the coefficients to enforce the non-negativity needed in the stochastic setting. In this paper, we present a method to quantify the error introduced by that change. We interpret the modified discretization matrix as the exact finite element discretization of a perturbed equation. The forward error, the error between the analytical solutions to the original and the perturbed equations, is bounded by the backward error, the error between the diffusion of the two equations. We present a backward analysis algorithm to compute the diffusion coefficient from a given discretization matrix. The analysis suggests a new way of deriving non-negative jump coefficients that minimizes the backward error. The theory is tested in numerical experiments indicating that the new method is superior and also minimizes the forward error.
NA Mar 7, 2019
Hierarchical Reaction-Diffusion Master Equation
Stefan Hellander, Andreas Hellander
We have developed an algorithm coupling mesoscopic simulations on different levels in a hierarchy of Cartesian meshes. Based on the multiscale nature of the chemical reactions, some molecules in the system will live on a fine-grained mesh, while others live on a coarse-grained mesh. By allowing molecules to transfer from the fine levels to the coarse levels when appropriate, we show that we can save up to three orders of magnitude of computational time compared to microscopic simulations or highly resolved mesoscopic simulations, without losing significant accuracy. We demonstrate this in several numerical examples with systems that cannot be accurately simulated with a coarse-grained mesoscopic model.
LG Sep 19, 2023
Toward efficient resource utilization at edge nodes in federated learning
Sadi Alawadi, Addi Ait-Mlouk, Salman Toor et al.
Federated learning (FL) enables edge nodes to collaboratively contribute to constructing a global model without sharing their data. This is accomplished by devices computing local, private model updates that are then aggregated by a server. However, computational resource constraints and network communication can become a severe bottleneck for larger model sizes typical for deep learning applications. Edge nodes tend to have limited hardware resources (RAM, CPU), and the network bandwidth and reliability at the edge are a concern for scaling federated fleet applications. In this paper, we propose and evaluate an FL strategy inspired by transfer learning in order to reduce resource utilization on devices, as well as the load on the server and network in each global training round. For each local model update, we randomly select layers to train, freezing the remaining part of the model. In doing so, we can reduce both server load and communication costs per round by excluding all untrained layer weights from being transferred to the server. The goal of this study is to empirically explore the potential trade-off between resource utilization on devices and global model convergence under the proposed strategy. We implement the approach using the federated learning framework FEDn. A number of experiments were carried out over different datasets (CIFAR-10, CASA, and IMDB), performing different tasks using different deep-learning model architectures. Our results show that training the model partially can accelerate the training process, use on-device resources efficiently, and reduce data transmission by around 75% and 53% when we train 25% and 50% of the model layers, respectively, without harming the resulting global model accuracy.
NA Oct 5, 2016
Robustness analysis of spatiotemporal models in the presence of extrinsic fluctuations
Andreas Hellander, Jan Klosa, Per Lötstedt et al.
We analyze the governing partial differential equations of a model of pole-to-pole oscillations of the MinD protein in a bacterial cell. The sensitivity to extrinsic noise in the parameters of the model is explored. Our analysis shows that overall, the oscillations are robust to extrinsic perturbations in the sense that small perturbations in reaction coefficients result in small differences in the frequency and in the amplitude. However, a combination of analysis and simulation also reveals that the oscillations are more sensitive to some extrinsic time-scales than to others.
CL Apr 4, 2023
FedBot: Enhancing Privacy in Chatbots with Federated Learning
Addi Ait-Mlouk, Sadi Alawadi, Salman Toor et al.
Chatbots are mainly data-driven and usually based on utterances that might be sensitive. However, training deep learning models on shared data can violate user privacy. Such issues have commonly existed in chatbots since their inception. In the literature, there have been many approaches to deal with privacy, such as differential privacy and secure multi-party computation, but most of them need to have access to users' data. In this context, Federated Learning (FL) aims to protect data privacy through distributed learning methods that keep the data in its location. This paper presents FedBot, a proof-of-concept (POC) privacy-preserving chatbot that leverages large-scale customer support data. The POC combines Deep Bidirectional Transformer models and federated learning algorithms to protect customer data privacy during collaborative model training. The results of the proof-of-concept showcase the potential for privacy-preserving chatbots to transform the customer support industry by delivering personalized and efficient customer service that meets data privacy regulations and legal requirements. Furthermore, the system is specifically designed to improve its performance and accuracy over time by leveraging its ability to learn from previous interactions.
LG Jan 29
Epistemic Uncertainty Quantification for Pre-trained VLMs via Riemannian Flow Matching
Li Ju, Mayank Nautiyal, Andreas Hellander et al.
Vision-Language Models (VLMs) are typically deterministic in nature and lack intrinsic mechanisms to quantify epistemic uncertainty, which reflects the model's lack of knowledge or ignorance of its own representations. We theoretically motivate negative log-density of an embedding as a proxy for the epistemic uncertainty, where low-density regions signify model ignorance. The proposed method REPVLM computes the probability density on the hyperspherical manifold of the VLM embeddings using Riemannian Flow Matching. We empirically demonstrate that REPVLM achieves near-perfect correlation between uncertainty and prediction error, significantly outperforming existing baselines. Beyond classification, we demonstrate that the model also provides a scalable metric for out-of-distribution detection and automated data curation.
LG May 13
GeoFlowVLM: Geometry-Aware Joint Uncertainty for Frozen Vision-Language Embedding
Mayank Nautiyal, Li Ju, Andreas Hellander et al.
Standard dual-encoder vision-language models that map images and text to deterministic points on a shared unit hypersphere through $\ell_2$ normalization typically expose neither aleatoric uncertainty (cross-modal ambiguity) nor epistemic uncertainty (lack of training-distribution support). Existing post-hoc methods either recover at most one of the two uncertainty components, or ignore the hyperspherical geometry of these models' embeddings. We propose GeoFlowVLM as a post-hoc adapter that learns the joint distribution of paired $\ell_2$-normalised dual-encoder VLM embeddings on the product hypersphere $\mathbb{S}^{d-1} \times \mathbb{S}^{d-1}$ via Riemannian flow matching with a single masked velocity field. A consistency result shows that, in the population limit, the trained network exposes the joint flow and both cross-modal conditional flows as valid Riemannian flow-matching velocity fields on their respective domains. We derive two quantities from this single model: a conditional retrieval entropy that quantifies aleatoric ambiguity with a decision-theoretic interpretation via a Fano-type bound, and a marginal-typicality epistemic score justified by an exact chain-rule decomposition of the joint NLL. This decomposition isolates a cross-modal pointwise-mutual-information term that is structurally discriminative rather than epistemic, and is empirically the only consistently uninformative standalone component. Empirically, the entropy tracks Recall@1 with near-ideal monotonic calibration across three retrieval benchmarks in both directions, and the marginal-typicality sum yields consistently calibrated selective accuracy across four zero-shot classification benchmarks.
ML Jan 30
OneFlowSBI: One Model, Many Queries for Simulation-Based Inference
Mayank Nautiyal, Li Ju, Melker Ernfors et al.
We introduce OneFlowSBI, a unified framework for simulation-based inference that learns a single flow-matching generative model over the joint distribution of parameters and observations. Leveraging a query-aware masking distribution during training, the same model supports multiple inference tasks, including posterior sampling, likelihood estimation, and arbitrary conditional distributions, without task-specific retraining. We evaluate OneFlowSBI on ten benchmark inference problems and two high-dimensional real-world inverse problems across multiple simulation budgets. OneFlowSBI is shown to deliver competitive performance against state-of-the-art generalized inference solvers and specialized posterior estimators, while enabling efficient sampling with few ODE integration steps and remaining robust under noisy and partially observed data.
LG Nov 21, 2024
Variational Autoencoders for Efficient Simulation-Based Inference
Mayank Nautiyal, Andrey Shternshis, Andreas Hellander et al.
We present a generative modeling approach based on the variational inference framework for likelihood-free simulation-based inference. The method leverages latent variables within variational autoencoders to efficiently estimate complex posterior distributions arising from stochastic simulations. We explore two variations of this approach distinguished by their treatment of the prior distribution. The first model adapts the prior based on observed data using a multivariate prior network, enhancing generalization across various posterior queries. In contrast, the second model utilizes a standard Gaussian prior, offering simplicity while still effectively capturing complex posterior distributions. We demonstrate the ability of the proposed approach to approximate complex posteriors while maintaining computational efficiency on well-established benchmark problems.
CR Aug 27, 2025
From Research to Reality: Feasibility of Gradient Inversion Attacks in Federated Learning
Viktor Valadi, Mattias Åkesson, Johan Östman et al.
Gradient inversion attacks have garnered attention for their ability to compromise privacy in federated learning. However, many studies consider attacks with the model in inference mode, where training-time behaviors like dropout are disabled and batch normalization relies on fixed statistics. In this work, we systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients, which we show dramatically simplifies inversion. To assess attack feasibility under more realistic conditions, we turn to clients operating in standard training mode. In this setting, we find that successful attacks are only possible when several architectural conditions are met simultaneously: models must be shallow and wide, use skip connections, and, critically, employ pre-activation normalization. We introduce two novel attacks against models in training-mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We extend these efforts by presenting the first attack on a production-grade object-detection model. Here, to enable any visibly identifiable leakage, we revert to the lenient inference mode setting and make multiple architectural modifications to increase model vulnerability, with the extent of required changes highlighting the strong inherent robustness of such architectures. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy. Our analysis provides actionable insight into when models are likely vulnerable, when they appear robust, and where subtle leakage may persist. Together, these findings reframe how gradient inversion risk should be assessed in future research and deployment scenarios.
LG May 13, 2025
ConDiSim: Conditional Diffusion Models for Simulation Based Inference
Mayank Nautiyal, Andreas Hellander, Prashant Singh
We present a conditional diffusion model, ConDiSim, for simulation-based inference of complex systems with intractable likelihoods. ConDiSim leverages denoising diffusion probabilistic models to approximate posterior distributions, consisting of a forward process that adds Gaussian noise to parameters, and a reverse process learning to denoise, conditioned on observed data. This approach effectively captures complex dependencies and multi-modalities within posteriors. ConDiSim is evaluated across ten benchmark problems and two real-world test problems, where it demonstrates effective posterior approximation accuracy while maintaining computational efficiency and stability in model training. ConDiSim offers a robust and extensible framework for simulation-based inference, particularly suitable for parameter inference workflows requiring fast inference methods.
CL Feb 9, 2022
FedQAS: Privacy-aware machine reading comprehension with federated learning
Addi Ait-Mlouk, Sadi Alawadi, Salman Toor et al.
Machine reading comprehension (MRC) of text data is one important task in Natural Language Understanding. It is a complex NLP problem with a lot of ongoing research fueled by the release of the Stanford Question Answering Dataset (SQuAD) and Conversational Question Answering (CoQA). It is considered to be an effort to teach computers how to "understand" a text, and then to be able to answer questions about it using deep learning. However, until now large-scale training on private text data and knowledge sharing has been missing for this NLP task. Hence, we present FedQAS, a privacy-preserving machine reading system capable of leveraging large-scale private data without the need to pool those datasets in a central location. The proposed approach combines transformer models and federated learning technologies. The system is developed using the FEDn framework and deployed as a proof-of-concept alliance initiative. FedQAS is flexible, language-agnostic, and allows intuitive participation and execution of local model training. In addition, we present the architecture and implementation of the system, as well as provide a reference evaluation based on the SQuAD dataset, to showcase how it overcomes data privacy issues and enables knowledge sharing between alliance members in a federated learning setting.
LG Feb 27, 2021
Scalable federated machine learning with FEDn
Morgan Ekmefjord, Addi Ait-Mlouk, Sadi Alawadi et al.
Federated machine learning has great promise to overcome the input privacy challenge in machine learning. The appearance of several projects capable of simulating federated learning has led to a corresponding rapid progress on algorithmic aspects of the problem. However, there is still a lack of federated machine learning frameworks that focus on fundamental aspects such as scalability, robustness, security, and performance in a geographically distributed setting. To bridge this gap we have designed and developed the FEDn framework. A main feature of FEDn is to support both cross-device and cross-silo training settings. This makes FEDn a powerful tool for researching a wide range of machine learning applications in a realistic setting.
ML Feb 12, 2021
Robust and integrative Bayesian neural networks for likelihood-free parameter inference
Fredrik Wrede, Robin Eriksson, Richard Jiang et al.
State-of-the-art neural network-based methods for learning summary statistics have delivered promising results for simulation-based likelihood-free parameter inference. Existing approaches require density estimation as a post-processing step building upon deterministic neural networks, and do not take network prediction uncertainty into account. This work proposes a robust integrated approach that learns summary statistics using Bayesian neural networks, and directly estimates the posterior density using categorical distributions. An adaptive sampling scheme selects simulation locations to efficiently and iteratively refine the predictive posterior of the network conditioned on observations. This allows for more efficient and robust convergence on comparatively large prior spaces. We demonstrate our approach on benchmark examples and compare against related methods.
ML Jan 31, 2020
Convolutional Neural Networks as Summary Statistics for Approximate Bayesian Computation
Mattias Åkesson, Prashant Singh, Fredrik Wrede et al.
Approximate Bayesian Computation is widely used in systems biology for inferring parameters in stochastic gene regulatory network models. Its performance hinges critically on the ability to summarize high-dimensional system responses such as time series into a few informative, low-dimensional summary statistics. The quality of those statistics acutely impacts the accuracy of the inference task. Existing methods to select the best subset out of a pool of candidate statistics do not scale well with large pools of several tens to hundreds of candidate statistics. Since high quality statistics are imperative for good performance, this becomes a serious bottleneck when performing inference on complex and high-dimensional problems. This paper proposes a convolutional neural network architecture for automatically learning informative summary statistics of temporal responses. We show that the proposed network can effectively circumvent the statistics selection problem of the preprocessing step for ABC inference. The proposed approach is demonstrated on two benchmark problems and one challenging inference problem learning parameters in a high-dimensional stochastic genetic oscillator. We also study the impact of experimental design on network performance by comparing different data richness and data acquisition strategies.
ML May 22, 2018
Multi-Statistic Approximate Bayesian Computation with Multi-Armed Bandits
Prashant Singh, Andreas Hellander
Approximate Bayesian computation is an established and popular method for likelihood-free inference with applications in many disciplines. The effectiveness of the method depends critically on the availability of well performing summary statistics. Summary statistic selection relies heavily on domain knowledge and carefully engineered features, and can be a laborious time consuming process. Since the method is sensitive to data dimensionality, the process of selecting summary statistics must balance the need to include informative statistics and the dimensionality of the feature vector. This paper proposes to treat the problem of dynamically selecting an appropriate summary statistic from a given pool of candidate summary statistics as a multi-armed bandit problem. This allows approximate Bayesian computation rejection sampling to dynamically focus on a distribution over well performing summary statistics as opposed to a fixed set of statistics. The proposed method is unique in that it does not require any pre-processing and is scalable to a large number of candidate statistics. This enables efficient use of a large library of possible time series summary statistics without prior feature engineering. The proposed approach is compared to state-of-the-art methods for summary statistics selection using a challenging test problem from the systems biology literature.
NA Sep 5, 2017
Mesoscopic-microscopic spatial stochastic simulation with automatic system partitioning
Stefan Hellander, Andreas Hellander, Linda Petzold
The reaction-diffusion master equation (RDME) is a model that allows for efficient on-lattice simulation of spatially resolved stochastic chemical kinetics. Compared to off-lattice hard-sphere simulations with Brownian Dynamics (BD) or Green's Function Reaction Dynamics (GFRD) the RDME can be orders of magnitude faster if the lattice spacing can be chosen coarse enough. However, strongly diffusion-controlled reactions mandate a very fine mesh resolution for acceptable accuracy. It is common that reactions in the same model differ in their degree of diffusion control and therefore require different degrees of mesh resolution. This renders mesoscopic simulation inefficient for systems with multiscale properties. Mesoscopic-microscopic hybrid methods address this problem by resolving the most challenging reactions with a microscale, off-lattice simulation. However, all methods to date require manual partitioning of a system, effectively limiting their usefulness as 'black-box' simulation codes. In this paper we propose a hybrid simulation algorithm with automatic system partitioning based on indirect a priori error estimates. We demonstrate the accuracy and efficiency of the method on models of diffusion-controlled networks in 3D.
AP Mar 24, 2015
Mesoscopic modeling of stochastic reaction-diffusion kinetics in the subdiffusive regime
Emilie Blanc, Stefan Engblom, Andreas Hellander et al.
Subdiffusion has been proposed as an explanation of various kinetic phenomena inside living cells. In order to facilitate large-scale computational studies of subdiffusive chemical processes, we extend a recently suggested mesoscopic model of subdiffusion into an accurate and consistent reaction-subdiffusion computational framework. Two different possible models of chemical reaction are revealed and some basic dynamic properties are derived. In certain cases those mesoscopic models have a direct interpretation at the macroscopic level as fractional partial differential equations in a bounded time interval. Through analysis and numerical experiments we estimate the macroscopic effects of reactions under subdiffusive mixing. The models display properties observed also in experiments: for a short time interval the behavior of the diffusion and the reaction is ordinary, in an intermediate interval the behavior is anomalous, and at long times the behavior is ordinary again.