LGMar 18, 2023
Machine learning with data assimilation and uncertainty quantification for dynamical systems: a reviewSibo Cheng, Cesar Quilodran-Casas, Said Ouala et al.
Data Assimilation (DA) and Uncertainty quantification (UQ) are extensively used in analysing and reducing error propagation in high-dimensional spatial-temporal dynamics. Typical applications span from computational fluid dynamics (CFD) to geoscience and climate systems. Recently, much effort has been given in combining DA, UQ and machine learning (ML) techniques. These research efforts seek to address some critical challenges in high-dimensional dynamical systems, including but not limited to dynamical system identification, reduced order surrogate modelling, error covariance specification and model error correction. A large number of developed techniques and methodologies exhibit a broad applicability across numerous domains, resulting in the necessity for a comprehensive guide. This paper provides the first overview of the state-of-the-art researches in this interdisciplinary field, covering a wide range of applications. This review aims at ML scientists who attempt to apply DA and UQ techniques to improve the accuracy and the interpretability of their models, but also at DA and UQ experts who intend to integrate cutting-edge ML approaches to their systems. Therefore, this article has a special focus on how ML methods can overcome the existing limits of DA and UQ, and vice versa. Some exciting perspectives of this rapidly developing research field are also discussed.
MLOct 25, 2022
Online model error correction with neural networks in the incremental 4D-Var frameworkAlban Farchi, Marcin Chrust, Marc Bocquet et al.
Recent studies have demonstrated that it is possible to combine machine learning with data assimilation to reconstruct the dynamics of a physical model partially and imperfectly observed. Data assimilation is used to estimate the system state from the observations, while machine learning computes a surrogate model of the dynamical system based on those estimated states. The surrogate model can be defined as an hybrid combination where a physical model based on prior knowledge is enhanced with a statistical model estimated by a neural network. The training of the neural network is typically done offline, once a large enough dataset of model state estimates is available. By contrast, with online approaches the surrogate model is improved each time a new system state estimate is computed. Online approaches naturally fit the sequential framework encountered in geosciences where new observations become available with time. In a recent methodology paper, we have developed a new weak-constraint 4D-Var formulation which can be used to train a neural network for online model error correction. In the present article, we develop a simplified version of that method, in the incremental 4D-Var framework adopted by most operational weather centres. The simplified method is implemented in the ECMWF Object-Oriented Prediction System, with the help of a newly developed Fortran neural network library, and tested with a two-layer two-dimensional quasi geostrophic model. The results confirm that online learning is effective and yields a more accurate model error correction than offline learning. Finally, the simplified method is compatible with future applications to state-of-the-art models such as the ECMWF Integrated Forecasting System.
NEMar 21, 2022
Voltage-Dependent Synaptic Plasticity (VDSP): Unsupervised probabilistic Hebbian plasticity rule based on neurons membrane potentialNikhil Garg, Ismael Balafrej, Terrence C. Stewart et al.
This study proposes voltage-dependent-synaptic plasticity (VDSP), a novel brain-inspired unsupervised local learning rule for the online implementation of Hebb's plasticity mechanism on neuromorphic hardware. The proposed VDSP learning rule updates the synaptic conductance on the spike of the postsynaptic neuron only, which reduces by a factor of two the number of updates with respect to standard spike-timing-dependent plasticity (STDP). This update is dependent on the membrane potential of the presynaptic neuron, which is readily available as part of neuron implementation and hence does not require additional memory for storage. Moreover, the update is also regularized on synaptic weight and prevents explosion or vanishing of weights on repeated stimulation. Rigorous mathematical analysis is performed to draw an equivalence between VDSP and STDP. To validate the system-level performance of VDSP, we train a single-layer spiking neural network (SNN) for the recognition of handwritten digits. We report 85.01 $ \pm $ 0.76% (Mean $ \pm $ S.D.) accuracy for a network of 100 output neurons on the MNIST dataset. The performance improves when scaling the network size (89.93 $ \pm $ 0.41% for 400 output neurons, 90.56 $ \pm $ 0.27 for 500 neurons), which validates the applicability of the proposed learning rule for spatial pattern recognition tasks. Future work will consider more complicated tasks. Interestingly, the learning rule better adapts than STDP to the frequency of input signal and does not require hand-tuning of hyperparameters
51.0LGMar 26Code
Anchored-Branched Steady-state WInd Flow Transformer (AB-SWIFT): a metamodel for 3D atmospheric flow in urban environmentsArmand de Villeroché, Rem-Sophia Mouradi, Vincent Le Guen et al.
Air flow modeling at a local scale is essential for applications such as pollutant dispersion modeling or wind farm modeling. To circumvent costly Computational Fluid Dynamics (CFD) computations, deep learning surrogate models have recently emerged as promising alternatives. However, in the context of urban air flow, deep learning models struggle to adapt to the high variations of the urban geometry and to large mesh sizes. To tackle these challenges, we introduce Anchored Branched Steady-state WInd Flow Transformer (AB-SWIFT), a transformer-based model with an internal branched structure uniquely designed for atmospheric flow modeling. We train our model on a specially designed database of atmospheric simulations around randomised urban geometries and with a mixture of unstable, neutral, and stable atmospheric stratifications. Our model reaches the best accuracy on all predicted fields compared to state-of-the-art transformers and graph-based models. Our code and data is available at https://github.com/cerea-daml/abswift.
LGFeb 13, 2025
Machine learning for modelling unstructured grid data in computational physics: a reviewSibo Cheng, Marc Bocquet, Weiping Ding et al.
Unstructured grid data are essential for modelling complex geometries and dynamics in computational physics. Yet, their inherent irregularity presents significant challenges for conventional machine learning (ML) techniques. This paper provides a comprehensive review of advanced ML methodologies designed to handle unstructured grid data in high-dimensional dynamical systems. Key approaches discussed include graph neural networks, transformer models with spatial attention mechanisms, interpolation-integrated ML methods, and meshless techniques such as physics-informed neural networks. These methodologies have proven effective across diverse fields, including fluid dynamics and environmental simulations. This review is intended as a guidebook for computational scientists seeking to apply ML approaches to unstructured grid data in their domains, as well as for ML researchers looking to address challenges in computational physics. It places special focus on how ML methods can overcome the inherent limitations of traditional numerical techniques and, conversely, how insights from computational physics can inform ML development. To support benchmarking, this review also provides a summary of open-access datasets of unstructured grid data in computational physics. Finally, emerging directions such as generative models with unstructured data, reinforcement learning for mesh generation, and hybrid physics-data-driven paradigms are discussed to inspire future advancements in this evolving field.
LGFeb 18, 2025
Ensemble Kalman filter in latent space using a variational autoencoder pairIvo Pasmans, Yumeng Chen, Tobias Sebastian Finn et al.
Popular (ensemble) Kalman filter data assimilation (DA) approaches assume that the errors in both the a priori estimate of the state and those in the observations are Gaussian. For constrained variables, e.g. sea ice concentration or stress, such an assumption does not hold. The variational autoencoder (VAE) is a machine learning (ML) technique that allows to map an arbitrary distribution to/from a latent space in which the distribution is supposedly closer to a Gaussian. We propose a novel hybrid DA-ML approach in which VAEs are incorporated in the DA procedure. Specifically, we introduce a variant of the popular ensemble transform Kalman filter (ETKF) in which the analysis is applied in the latent space of a single VAE or a pair of VAEs. In twin experiments with a simple circular model, whereby the circle represents an underlying submanifold to be respected, we find that the use of a VAE ensures that a posteri ensemble members lie close to the manifold containing the truth. Furthermore, online updating of the VAE is necessary and achievable when this manifold varies in time, i.e. when it is non-stationary. We demonstrate that introducing an additional second latent space for the observational innovations improves robustness against detrimental effects of non-Gaussianity and bias in the observational errors but it slightly lessens the performance if observational errors are strictly Gaussian.
MLMar 6, 2024
Development of an offline and online hybrid model for the Integrated Forecasting SystemAlban Farchi, Marcin Chrust, Marc Bocquet et al.
In recent years, there has been significant progress in the development of fully data-driven global numerical weather prediction models. These machine learning weather prediction models have their strength, notably accuracy and low computational requirements, but also their weakness: they struggle to represent fundamental dynamical balances, and they are far from being suitable for data assimilation experiments. Hybrid modelling emerges as a promising approach to address these limitations. Hybrid models integrate a physics-based core component with a statistical component, typically a neural network, to enhance prediction capabilities. In this article, we propose to develop a model error correction for the operational Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts using a neural network. The neural network is initially pre-trained offline using a large dataset of operational analyses and analysis increments. Subsequently, the trained network is integrated into the IFS within the Object-Oriented Prediction System (OOPS) so as to be used in data assimilation and forecast experiments. It is then further trained online using a recently developed variant of weak-constraint 4D-Var. The results show that the pre-trained neural network already provides a reliable model error correction, which translates into reduced forecast errors in many conditions and that the online training further improves the accuracy of the hybrid model in many conditions.
AO-PHAug 20, 2025
Generative AI models capture realistic sea-ice evolution from days to decadesTobias Sebastian Finn, Marc Bocquet, Pierre Rampal et al.
Sea ice plays an important role in stabilising the Earth system. Yet, representing its dynamics remains a major challenge for models, as the underlying processes are scale-invariant and highly anisotropic. This poses a dilemma: physics-based models that faithfully reproduce the observed dynamics are computationally costly, while efficient AI models sacrifice realism. Here, to resolve this dilemma, we introduce GenSIM, the first generative AI model to predict the evolution of the full Arctic sea-ice state at 12-hour increments. Trained for sub-daily forecasting on 20 years of sea-ice-ocean simulation data, GenSIM makes realistic predictions for 30 years, while reproducing the dynamical properties of sea ice with its leads and ridges and capturing long-term trends in the sea-ice volume. Notably, although solely driven by atmospheric reanalysis, GenSIM implicitly learns hidden signatures of multi-year ice-ocean interaction. Therefore, generative AI can extrapolate from sub-daily forecasts to decadal simulations, while retaining physical consistency.
LGJul 1, 2025
A Probabilistic Approach to Wildfire Spread Prediction Using a Denoising Diffusion Surrogate ModelWenbo Yu, Anirbit Ghosh, Tobias Sebastian Finn et al.
Thanks to recent advances in generative AI, computers can now simulate realistic and complex natural processes. We apply this capability to predict how wildfires spread, a task made difficult by the unpredictable nature of fire and the variety of environmental conditions it depends on. In this study, We present the first denoising diffusion model for predicting wildfire spread, a new kind of AI framework that learns to simulate fires not just as one fixed outcome, but as a range of possible scenarios. By doing so, it accounts for the inherent uncertainty of wildfire dynamics, a feature that traditional models typically fail to represent. Unlike deterministic approaches that generate a single prediction, our model produces ensembles of forecasts that reflect physically meaningful distributions of where fire might go next. This technology could help us develop smarter, faster, and more reliable tools for anticipating wildfire behavior, aiding decision-makers in fire risk assessment and response planning.
LGOct 14, 2025
Bridging Idealized and Operational Models: An Explainable AI Framework for Earth System EmulatorsPouria Behnoudfar, Charlotte Moser, Marc Bocquet et al.
Computer models are indispensable tools for understanding the Earth system. While high-resolution operational models have achieved many successes, they exhibit persistent biases, particularly in simulating extreme events and statistical distributions. In contrast, coarse-grained idealized models isolate fundamental processes and can be precisely calibrated to excel in characterizing specific dynamical and statistical features. However, different models remain siloed by disciplinary boundaries. By leveraging the complementary strengths of models of varying complexity, we develop an explainable AI framework for Earth system emulators. It bridges the model hierarchy through a reconfigured latent data assimilation technique, uniquely suited to exploit the sparse output from the idealized models. The resulting bridging model inherits the high resolution and comprehensive variables of operational models while achieving global accuracy enhancements through targeted improvements from idealized models. Crucially, the mechanism of AI provides a clear rationale for these advancements, moving beyond black-box correction to physically insightful understanding in a computationally efficient framework that enables effective physics-assisted digital twins and uncertainty quantification. We demonstrate its power by significantly correcting biases in CMIP6 simulations of El Niño spatiotemporal patterns, leveraging statistically accurate idealized models. This work also highlights the importance of pushing idealized model development and advancing communication between modeling communities.
LGJun 26, 2024
Towards diffusion models for large-scale sea-ice modellingTobias Sebastian Finn, Charlotte Durand, Alban Farchi et al.
We make the first steps towards diffusion models for unconditional generation of multivariate and Arctic-wide sea-ice states. While targeting to reduce the computational costs by diffusion in latent space, latent diffusion models also offer the possibility to integrate physical knowledge into the generation process. We tailor latent diffusion models to sea-ice physics with a censored Gaussian distribution in data space to generate data that follows the physical bounds of the modelled variables. Our latent diffusion models reach similar scores as the diffusion model trained in data space, but they smooth the generated fields as caused by the latent mapping. While enforcing physical bounds cannot reduce the smoothing, it improves the representation of the marginal ice zone. Therefore, for large-scale Earth system modelling, latent diffusion models can have many advantages compared to diffusion in data space if the significant barrier of smoothing can be resolved.
MLJul 23, 2021
State, global and local parameter estimation using local ensemble Kalman filters: applications to online machine learning of chaotic dynamicsQuentin Malartic, Alban Farchi, Marc Bocquet
In a recent methodological paper, we showed how to learn chaotic dynamics along with the state trajectory from sequentially acquired observations, using local ensemble Kalman filters. Here, we more systematically investigate the possibility to use a local ensemble Kalman filter with either covariance localisation or local domains, in order to retrieve the state and a mix of key global and local parameters. Global parameters are meant to represent the surrogate dynamical core, for instance through a neural network, which is reminiscent of data-driven machine learning of dynamics, while the local parameters typically stand for the forcings of the model. Aiming at joint state and parameter estimation, a family of algorithms for covariance and local domain localisation is proposed. In particular, we show how to rigorously update global parameters using a local domain ensemble Kalman filter (EnKF) such as the local ensemble transform Kalman filter (LETKF), an inherently local method. The approach is tested with success on the 40-variable Lorenz model using several of the local EnKF flavors. A two-dimensional illustration based on a multi-layer Lorenz model is finally provided. It uses radiance-like non-local observations. It features both local domains and covariance localisation in order to learn the chaotic dynamics and the local forcings. This paper more generally addresses the key question of online estimation of both global and local model parameters.
MLJul 23, 2021
A comparison of combined data assimilation and machine learning methods for offline and online model error correctionAlban Farchi, Marc Bocquet, Patrick Laloyaux et al.
Recent studies have shown that it is possible to combine machine learning methods with data assimilation to reconstruct a dynamical system using only sparse and noisy observations of that system. The same approach can be used to correct the error of a knowledge-based model. The resulting surrogate model is hybrid, with a statistical part supplementing a physical part. In practice, the correction can be added as an integrated term (i.e. in the model resolvent) or directly inside the tendencies of the physical model. The resolvent correction is easy to implement. The tendency correction is more technical, in particular it requires the adjoint of the physical model, but also more flexible. We use the two-scale Lorenz model to compare the two methods. The accuracy in long-range forecast experiments is somewhat similar between the surrogate models using the resolvent correction and the tendency correction. By contrast, the surrogate models using the tendency correction significantly outperform the surrogate models using the resolvent correction in data assimilation experiments. Finally, we show that the tendency correction opens the possibility to make online model error correction, i.e. improving the model progressively as new observations become available. The resulting algorithm can be seen as a new formulation of weak-constraint 4D-Var. We compare online and offline learning using the same framework with the two-scale Lorenz system, and show that with online learning, it is possible to extract all the information from sparse and noisy observations.
LGJul 2, 2021
Model of the Weak Reset Process in HfOx Resistive Memory for Deep Learning FrameworksAtreya Majumdar, Marc Bocquet, Tifenn Hirtzlin et al.
The implementation of current deep learning training algorithms is power-hungry, owing to data transfer between memory and logic units. Oxide-based RRAMs are outstanding candidates to implement in-memory computing, which is less power-intensive. Their weak RESET regime, is particularly attractive for learning, as it allows tuning the resistance of the devices with remarkable endurance. However, the resistive change behavior in this regime suffers many fluctuations and is particularly challenging to model, especially in a way compatible with tools used for simulating deep learning. In this work, we present a model of the weak RESET process in hafnium oxide RRAM and integrate this model within the PyTorch deep learning framework. Validated on experiments on a hybrid CMOS/RRAM technology, our model reproduces both the noisy progressive behavior and the device-to-device (D2D) variability. We use this tool to train Binarized Neural Networks for the MNIST handwritten digit recognition task and the CIFAR-10 object classification task. We simulate our model with and without various aspects of device imperfections to understand their impact on the training process and identify that the D2D variability is the most detrimental aspect. The framework can be used in the same manner for other types of memories to identify the device imperfections that cause the most degradation, which can, in turn, be used to optimize the devices to reduce the impact of these imperfections.
MLOct 23, 2020
Using machine learning to correct model error in data assimilation and forecast applicationsAlban Farchi, Patrick Laloyaux, Massimo Bonavita et al.
The idea of using machine learning (ML) methods to reconstruct the dynamics of a system is the topic of recent studies in the geosciences, in which the key output is a surrogate model meant to emulate the dynamical model. In order to treat sparse and noisy observations in a rigorous way, ML can be combined to data assimilation (DA). This yields a class of iterative methods in which, at each iteration a DA step assimilates the observations, and alternates with a ML step to learn the underlying dynamics of the DA analysis. In this article, we propose to use this method to correct the error of an existent, knowledge-based model. In practice, the resulting surrogate model is an hybrid model between the original (knowledge-based) model and the ML model. We demonstrate numerically the feasibility of the method using a two-layer, two-dimensional quasi-geostrophic channel model. Model error is introduced by the means of perturbed parameters. The DA step is performed using the strong-constraint 4D-Var algorithm, while the ML step is performed using deep learning tools. The ML models are able to learn a substantial part of the model error and the resulting hybrid surrogate models produce better short- to mid-range forecasts. Furthermore, using the hybrid surrogate models for DA yields a significantly better analysis than using the original model.
COMP-PHSep 9, 2020
Combining data assimilation and machine learning to infer unresolved scale parametrisationJulien Brajard, Alberto Carrassi, Marc Bocquet et al.
In recent years, machine learning (ML) has been proposed to devise data-driven parametrisations of unresolved processes in dynamical numerical models. In most cases, the ML training leverages high-resolution simulations to provide a dense, noiseless target state. Our goal is to go beyond the use of high-resolution simulations and train ML-based parametrisation using direct data, in the realistic scenario of noisy and sparse observations. The algorithm proposed in this work is a two-step process. First, data assimilation (DA) techniques are applied to estimate the full state of the system from a truncated model. The unresolved part of the truncated model is viewed as a model error in the DA system. In a second step, ML is used to emulate the unresolved part, a predictor of model error given the state of the system. Finally, the ML-based parametrisation model is added to the physical core truncated model to produce a hybrid model. The DA component of the proposed method relies on an ensemble Kalman filter while the ML parametrisation is represented by a neural network. The approach is applied to the two-scale Lorenz model and to MAOOAM, a reduced-order coupled ocean-atmosphere model. We show that in both cases the hybrid model yields forecasts with better skill than the truncated model. Moreover, the attractor of the system is significantly better represented by the hybrid model than by the truncated model.
MLJun 6, 2020
Online learning of both state and dynamics using ensemble Kalman filtersMarc Bocquet, Alban Farchi, Quentin Malartic
The reconstruction of the dynamics of an observed physical system as a surrogate model has been brought to the fore by recent advances in machine learning. To deal with partial and noisy observations in that endeavor, machine learning representations of the surrogate model can be used within a Bayesian data assimilation framework. However, these approaches require to consider long time series of observational data, meant to be assimilated all together. This paper investigates the possibility to learn both the dynamics and the state online, i.e. to update their estimates at any time, in particular when new observations are acquired. The estimation is based on the ensemble Kalman filter (EnKF) family of algorithms using a rather simple representation for the surrogate model and state augmentation. We consider the implication of learning dynamics online through (i) a global EnKF, (i) a local EnKF and (iii) an iterative EnKF and we discuss in each case issues and algorithmic solutions. We then demonstrate numerically the efficiency and assess the accuracy of these methods using one-dimensional, one-scale and two-scale chaotic Lorenz models.
MLJan 17, 2020
Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximizationMarc Bocquet, Julien Brajard, Alberto Carrassi et al.
The reconstruction from observations of high-dimensional chaotic dynamics such as geophysical flows is hampered by (i) the partial and noisy observations that can realistically be obtained, (ii) the need to learn from long time series of data, and (iii) the unstable nature of the dynamics. To achieve such inference from the observations over long time series, it has been suggested to combine data assimilation and machine learning in several ways. We show how to unify these approaches from a Bayesian perspective using expectation-maximization and coordinate descents. In doing so, the model, the state trajectory and model error statistics are estimated all together. Implementations and approximations of these methods are discussed. Finally, we numerically and successfully test the approach on two relevant low-order chaotic models with distinct identifiability.
MLJan 6, 2020
Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: a case study with the Lorenz 96 modelJulien Brajard, Alberto Carassi, Marc Bocquet et al.
A novel method, based on the combination of data assimilation and machine learning is introduced. The new hybrid approach is designed for a two-fold scope: (i) emulating hidden, possibly chaotic, dynamics and (ii) predicting their future states. The method consists in applying iteratively a data assimilation step, here an ensemble Kalman filter, and a neural network. Data assimilation is used to optimally combine a surrogate model with sparse noisy data. The output analysis is spatially complete and is used as a training set by the neural network to update the surrogate model. The two steps are then repeated iteratively. Numerical experiments have been carried out using the chaotic 40-variables Lorenz 96 model, proving both convergence and statistical skill of the proposed hybrid approach. The surrogate model shows short-term forecast skill up to two Lyapunov times, the retrieval of positive Lyapunov exponents as well as the more energetic frequencies of the power density spectrum. The sensitivity of the method to critical setup parameters is also presented: the forecast skill decreases smoothly with increased observational noise but drops abruptly if less than half of the model domain is observed. The successful synergy between data assimilation and machine learning, proven here with a low-dimensional system, encourages further investigation of such hybrids with more sophisticated dynamics.
ETAug 12, 2019
Implementing Binarized Neural Networks with Magnetoresistive RAM without Error CorrectionTifenn Hirtzlin, Bogdan Penkovsky, Jacques-Olivier Klein et al.
One of the most exciting applications of Spin Torque Magnetoresistive Random Access Memory (ST-MRAM) is the in-memory implementation of deep neural networks, which could allow improving the energy efficiency of Artificial Intelligence by orders of magnitude with regards to its implementation on computers and graphics cards. In particular, ST-MRAM could be ideal for implementing Binarized Neural Networks (BNNs), a type of deep neural networks discovered in 2016, which can achieve state-of-the-art performance with a highly reduced memory footprint with regards to conventional artificial intelligence approaches. The challenge of ST-MRAM, however, is that it is prone to write errors and usually requires the use of error correction. In this work, we show that these bit errors can be tolerated by BNNs to an outstanding level, based on examples of image recognition tasks (MNIST, CIFAR-10 and ImageNet): bit error rates of ST-MRAM up to 0.1% have little impact on recognition accuracy. The requirements for ST-MRAM are therefore considerably relaxed for BNNs with regards to traditional applications. By consequence, we show that for BNNs, ST-MRAMs can be programmed with weak (low-energy) programming conditions, without error correcting codes. We show that this result can allow the use of low energy and low area ST-MRAM cells, and show that the energy savings at the system level can reach a factor two.