J. Nathan Kutz

LG
h-index101
113papers
5,718citations
Novelty44%
AI Score57

113 Papers

SYJun 22, 2023Code
PyKoopman: A Python Package for Data-Driven Approximation of the Koopman Operator

Shaowu Pan, Eurika Kaiser, Brian M. de Silva et al. · amazon-science

PyKoopman is a Python package for the data-driven approximation of the Koopman operator associated with a dynamical system. The Koopman operator is a principled linear embedding of nonlinear dynamics and facilitates the prediction, estimation, and control of strongly nonlinear dynamics using linear systems theory. In particular, PyKoopman provides tools for data-driven system identification for unforced and actuated systems that build on the equation-free dynamic mode decomposition (DMD) and its variants. In this work, we provide a brief description of the mathematical underpinnings of the Koopman operator, an overview and demonstration of the features implemented in PyKoopman (with code examples), practical advice for users, and a list of potential extensions to PyKoopman. Software is available at http://github.com/dynamicslab/pykoopman

NAAug 22, 2018Code
Deep learning of dynamics and signal-noise decomposition with time-stepping constraints

Samuel H. Rudy, J. Nathan Kutz, Steven L. Brunton

A critical challenge in the data-driven modeling of dynamical systems is producing methods robust to measurement error, particularly when data is limited. Many leading methods either rely on denoising prior to learning or on access to large volumes of data to average over the effect of noise. We propose a novel paradigm for data-driven modeling that simultaneously learns the dynamics and estimates the measurement noise at each observation. By constraining our learning algorithm, our method explicitly accounts for measurement error in the map between observations, treating both the measurement error and the dynamics as unknowns to be identified, rather than assuming idealized noiseless trajectories. We model the unknown vector field using a deep neural network, imposing a Runge-Kutta integrator structure to isolate this vector field, even when the data has a non-uniform timestep, thus constraining and focusing the modeling effort. We demonstrate the ability of this framework to form predictive models on a variety of canonical test problems of increasing complexity and show that it is robust to substantial amounts of measurement error. We also discuss issues with the generalizability of neural network models for dynamical systems and provide open-source code for all examples.

SYNov 29, 2017Code
Dynamic mode decomposition for compressive system identification

Zhe Bai, Eurika Kaiser, Joshua L. Proctor et al.

Dynamic mode decomposition has emerged as a leading technique to identify spatiotemporal coherent structures from high-dimensional data, benefiting from a strong connection to nonlinear dynamical systems via the Koopman operator. In this work, we integrate and unify two recent innovations that extend DMD to systems with actuation [Proctor et al., 2016] and systems with heavily subsampled measurements [Brunton et al., 2015]. When combined, these methods yield a novel framework for compressive system identification [code is publicly available at: https://github.com/zhbai/cDMDc]. It is possible to identify a low-order model from limited input-output data and reconstruct the associated full-state dynamic modes with compressed sensing, adding interpretability to the state of the reduced-order model. Moreover, when full-state data is available, it is possible to dramatically accelerate downstream computations by first compressing the data. We demonstrate this unified framework on two model systems, investigating the effects of sensor noise, different types of measurements (e.g., point sensors, Gaussian random projections, etc.), compression ratios, and different choices of actuation (e.g., localized, broadband, etc.). In the first example, we explore this architecture on a test system with known low-rank dynamics and an artificially inflated state dimension. The second example consists of a real-world engineering application given by the fluid flow past a pitching airfoil at low Reynolds number. This example provides a challenging and realistic test-case for the proposed method, and results demonstrate that the dominant coherent structures are well characterized despite actuation and heavily subsampled data.

LGJan 30, 2023
Convergence of uncertainty estimates in Ensemble and Bayesian sparse model discovery

L. Mars Gao, Urban Fasel, Steven L. Brunton et al. · uw

Sparse model identification enables nonlinear dynamical system discovery from data. However, the control of false discoveries for sparse model identification is challenging, especially in the low-data and high-noise limit. In this paper, we perform a theoretical study on ensemble sparse model discovery, which shows empirical success in terms of accuracy and robustness to noise. In particular, we analyse the bootstrapping-based sequential thresholding least-squares estimator. We show that this bootstrapping-based ensembling technique can perform a provably correct variable selection procedure with an exponential convergence rate of the error rate. In addition, we show that the ensemble sparse model discovery method can perform computationally efficient uncertainty estimation, compared to expensive Bayesian uncertainty quantification methods via MCMC. We demonstrate the convergence properties and connection to uncertainty quantification in various numerical studies on synthetic sparse linear regression and sparse model discovery. The experiments on sparse linear regression support that the bootstrapping-based sequential thresholding least-squares method has better performance for sparse variable selection compared to LASSO, thresholding least-squares, and bootstrapping-based LASSO. In the sparse model discovery experiment, we show that the bootstrapping-based sequential thresholding least-squares method can provide valid uncertainty quantification, converging to a delta measure centered around the true value with increased sample sizes. Finally, we highlight the improved robustness to hyperparameter selection under shifting noise and sparsity levels of the bootstrapping-based sequential thresholding least-squares method compared to other sparse regression methods.

NAJun 3, 2018
Data-driven identification of parametric partial differential equations

Samuel Rudy, Alessandro Alla, Steven L. Brunton et al.

In this work we present a data-driven method for the discovery of parametric partial differential equations (PDEs), thus allowing one to disambiguate between the underlying evolution equations and their parametric dependencies. Group sparsity is used to ensure parsimonious representations of observed dynamics in the form of a parametric PDE, while also allowing the coefficients to have arbitrary time series, or spatial dependence. This work builds on previous methods for the identification of constant coefficient PDEs, expanding the field to include a new class of equations which until now have eluded machine learning based identification methods. We show that group sequentially thresholded ridge regression outperforms group LASSO in identifying the fewest terms in the PDE along with their parametric dependency. The method is demonstrated on four canonical models with and without the introduction of noise.

LGOct 7, 2023
HyperSINDy: Deep Generative Modeling of Nonlinear Stochastic Governing Equations

Mozes Jacobs, Bingni W. Brunton, Steven L. Brunton et al. · uw

The discovery of governing differential equations from data is an open frontier in machine learning. The sparse identification of nonlinear dynamics (SINDy) \citep{brunton_discovering_2016} framework enables data-driven discovery of interpretable models in the form of sparse, deterministic governing laws. Recent works have sought to adapt this approach to the stochastic setting, though these adaptations are severely hampered by the curse of dimensionality. On the other hand, Bayesian-inspired deep learning methods have achieved widespread success in high-dimensional probabilistic modeling via computationally efficient approximate inference techniques, suggesting the use of these techniques for efficient stochastic equation discovery. Here, we introduce HyperSINDy, a framework for modeling stochastic dynamics via a deep generative model of sparse governing equations whose parametric form is discovered from data. HyperSINDy employs a variational encoder to approximate the distribution of observed states and derivatives. A hypernetwork \citep{ha_hypernetworks_2016} transforms samples from this distribution into the coefficients of a differential equation whose sparse form is learned simultaneously using a trainable binary mask \citep{louizos_learning_2018}. Once trained, HyperSINDy generates stochastic dynamics via a differential equation whose coefficients are driven by a Gaussian white noise. In experiments, HyperSINDy accurately recovers ground truth stochastic governing equations, with learned stochasticity scaling to match that of the data. Finally, HyperSINDy provides uncertainty quantification that scales to high-dimensional systems. Taken together, HyperSINDy offers a promising framework for model discovery and uncertainty quantification in real-world systems, integrating sparse equation discovery methods with advances in statistical machine learning and deep generative modeling.

NAMar 13, 2020
Randomized CP Tensor Decomposition

N. Benjamin Erichson, Krithika Manohar, Steven L. Brunton et al.

The CANDECOMP/PARAFAC (CP) tensor decomposition is a popular dimensionality-reduction method for multiway data. Dimensionality reduction is often sought after since many high-dimensional tensors have low intrinsic rank relative to the dimension of the ambient measurement space. However, the emergence of `big data' poses significant computational challenges for computing this fundamental tensor decomposition. By leveraging modern randomized algorithms, we demonstrate that coherent structures can be learned from a smaller representation of the tensor in a fraction of the time. Thus, this simple but powerful algorithm enables one to compute the approximate CP decomposition even for massive tensors. The approximation error can thereby be controlled via oversampling and the computation of power iterations. In addition to theoretical results, several empirical results demonstrate the performance of the proposed algorithm.

LGApr 7, 2022
Neural Implicit Flow: a mesh-agnostic dimensionality reduction paradigm of spatio-temporal data

Shaowu Pan, Steven L. Brunton, J. Nathan Kutz

High-dimensional spatio-temporal dynamics can often be encoded in a low-dimensional subspace. Engineering applications for modeling, characterization, design, and control of such large-scale systems often rely on dimensionality reduction to make solutions computationally tractable in real-time. Common existing paradigms for dimensionality reduction include linear methods, such as the singular value decomposition (SVD), and nonlinear methods, such as variants of convolutional autoencoders (CAE). However, these encoding techniques lack the ability to efficiently represent the complexity associated with spatio-temporal data, which often requires variable geometry, non-uniform grid resolution, adaptive meshing, and/or parametric dependencies. To resolve these practical engineering challenges, we propose a general framework called Neural Implicit Flow (NIF) that enables a mesh-agnostic, low-rank representation of large-scale, parametric, spatial-temporal data. NIF consists of two modified multilayer perceptrons (MLPs): (i) ShapeNet, which isolates and represents the spatial complexity, and (ii) ParameterNet, which accounts for any other input complexity, including parametric dependencies, time, and sensor measurements. We demonstrate the utility of NIF for parametric surrogate modeling, enabling the interpretable representation and compression of complex spatio-temporal dynamics, efficient many-spatial-query tasks, and improved generalization performance for sparse reconstruction.

LGNov 19, 2022
Bayesian autoencoders for data-driven discovery of coordinates, governing equations and fundamental constants

L. Mars Gao, J. Nathan Kutz · uw

Recent progress in autoencoder-based sparse identification of nonlinear dynamics (SINDy) under $\ell_1$ constraints allows joint discoveries of governing equations and latent coordinate systems from spatio-temporal data, including simulated video frames. However, it is challenging for $\ell_1$-based sparse inference to perform correct identification for real data due to the noisy measurements and often limited sample sizes. To address the data-driven discovery of physics in the low-data and high-noise regimes, we propose Bayesian SINDy autoencoders, which incorporate a hierarchical Bayesian sparsifying prior: Spike-and-slab Gaussian Lasso. Bayesian SINDy autoencoder enables the joint discovery of governing equations and coordinate systems with a theoretically guaranteed uncertainty estimate. To resolve the challenging computational tractability of the Bayesian hierarchical setting, we adapt an adaptive empirical Bayesian method with Stochatic gradient Langevin dynamics (SGLD) which gives a computationally tractable way of Bayesian posterior sampling within our framework. Bayesian SINDy autoencoder achieves better physics discovery with lower data and fewer training epochs, along with valid uncertainty quantification suggested by the experimental studies. The Bayesian SINDy autoencoder can be applied to real video data, with accurate physics discovery which correctly identifies the governing equation and provides a close estimate for standard physics constants like gravity $g$, for example, in videos of a pendulum.

NAJun 22, 2018
Data-driven Spatiotemporal Modal Decomposition for Time Frequency Analysis

Seth M. Hirsh, Bingni W. Brunton, J. Nathan Kutz · uw

We propose a new solution to the blind source separation problem that factors mixed time-series signals into a sum of spatiotemporal modes, with the constraint that the temporal components are intrinsic mode functions (IMF's). The key motivation is that IMF's allow the computation of meaningful Hilbert transforms of non-stationary data, from which instantaneous time-frequency representations may be derived. Our spatiotemporal intrinsic mode decomposition (STIMD) method leverages spatial correlations to generalize the extraction of IMF's from one-dimensional signals, commonly performed using the empirical mode decomposition (EMD), to multi-dimensional signals. Further, this data-driven method enables future-state prediction. We demonstrate STIMD on several synthetic examples, comparing it to common matrix factorization techniques, namely singular value decomposition (SVD), independent component analysis (ICA), and dynamic mode decomposition (DMD). We show that STIMD outperforms these methods at reconstruction and extracting interpretable modes. Next, we apply STIMD to analyze two real-world datasets, gravitational wave data and neural recordings from the rodent hippocampus.

DSDec 14, 2017
Optimized Sampling for Multiscale Dynamics

Krithika Manohar, Eurika Kaiser, Steven L. Brunton et al.

The characterization of intermittent, multiscale and transient dynamics using data-driven analysis remains an open challenge. We demonstrate an application of the Dynamic Mode Decomposition (DMD) with sparse sampling for the diagnostic analysis of multiscale physics. The DMD method is an ideal spatiotemporal matrix decomposition that correlates spatial features of computational or experimental data to periodic temporal behavior. DMD can be modified into a multiresolution analysis to separate complex dynamics into a hierarchy of multiresolution timescale components, where each level of the hierarchy divides dynamics into distinct background (slow) and foreground (fast) timescales. The multiresolution DMD is capable of characterizing nonlinear dynamical systems in an equation-free manner by recursively decomposing the state of the system into low-rank spatial modes and their temporal Fourier dynamics. Moreover, these multiresolution DMD modes can be used to determined sparse sampling locations which are nearly optimal for dynamic regime classification and full state reconstruction. Specifically, optimized sensors are efficiently chosen using QR column pivots of the DMD library, thus avoiding an NP-hard selection process. We demonstrate the efficacy of the method on several examples, including global sea-surface temperature data, and show that only a small number of sensors are needed for accurate global reconstructions and classification of El Niño events.

NAApr 1, 2017
Variable projection methods for an optimized dynamic mode decomposition

Travis Askham, J. Nathan Kutz

The dynamic mode decomposition (DMD) has become a leading tool for data-driven modeling of dynamical systems, providing a regression framework for fitting linear dynamical models to time-series measurement data. We present a simple algorithm for computing an optimized version of the DMD for data which may be collected at unevenly spaced sample times. By making use of the variable projection method for nonlinear least squares problems, the algorithm is capable of solving the underlying nonlinear optimization problem efficiently. We explore the performance of the algorithm with some numerical examples for synthetic and real data from dynamical systems and find that the resulting decomposition displays less bias in the presence of noise than standard DMD algorithms. Because of the flexibility of the algorithm, we also present some interesting new options for DMD-based analysis.

LGNov 1, 2023
A Unified Framework to Enforce, Discover, and Promote Symmetry in Machine Learning

Samuel E. Otto, Nicholas Zolman, J. Nathan Kutz et al.

Symmetry is present throughout nature and continues to play an increasingly central role in physics and machine learning. Fundamental symmetries, such as Poincaré invariance, allow physical laws discovered in laboratories on Earth to be extrapolated to the farthest reaches of the universe. Symmetry is essential to achieving this extrapolatory power in machine learning applications. For example, translation invariance in image classification allows models with fewer parameters, such as convolutional neural networks, to be trained on smaller data sets and achieve state-of-the-art performance. In this paper, we provide a unifying theoretical and methodological framework for incorporating symmetry into machine learning models in three ways: 1. enforcing known symmetry when training a model; 2. discovering unknown symmetries of a given model or data set; and 3. promoting symmetry during training by learning a model that breaks symmetries within a user-specified group of candidates when there is sufficient evidence in the data. We show that these tasks can be cast within a common mathematical framework whose central object is the Lie derivative associated with fiber-linear Lie group actions on vector bundles. We extend and unify several existing results by showing that enforcing and discovering symmetry are linear-algebraic tasks that are dual with respect to the bilinear structure of the Lie derivative. We also propose a novel way to promote symmetry by introducing a class of convex regularization functions based on the Lie derivative and nuclear norm relaxation to penalize symmetry breaking during training of machine learning models. We explain how these ideas can be applied to a wide range of machine learning models including basis function regression, dynamical systems discovery, neural networks, and neural operators acting on fields.

LGMar 30, 2023
Machine Learning for Partial Differential Equations

Steven L. Brunton, J. Nathan Kutz

Partial differential equations (PDEs) are among the most universal and parsimonious descriptions of natural physical laws, capturing a rich variety of phenomenology and multi-scale physics in a compact and symbolic representation. This review will examine several promising avenues of PDE research that are being advanced by machine learning, including: 1) the discovery of new governing PDEs and coarse-grained approximations for complex natural and engineered systems, 2) learning effective coordinate systems and reduced-order models to make PDEs more amenable to analysis, and 3) representing solution operators and improving traditional numerical algorithms. In each of these fields, we summarize key advances, ongoing challenges, and opportunities for further development.

NADec 1, 2018
Smoothing and parameter estimation by soft-adherence to governing equations

Samuel Rudy, Steven Brunton, J. Nathan Kutz

The analysis of high-dimensional dynamical systems generally requires the integration of simulation data with experimental measurements. Experimental data often has substantial amounts of measurement noise that compromises the ability to produce accurate dimensionality reduction, parameter estimation, reduced order models, and/or balanced models for control. Data assimilation attempts to overcome the deleterious effects of noise by producing a set of algorithms for state estimation from noisy and possibly incomplete measurements. Indeed, methods such as Kalman filtering and smoothing are vital tools for scientists in fields ranging from electronics to weather forecasting. In this work we develop a novel framework for smoothing data based on known or partially known nonlinear governing equations. The method yields superior results to current techniques when applied to problems with known deterministic dynamics. By exploiting the numerical time-stepping constraints of the deterministic system, an optimization formulation can readily extract the noise from the nonlinear dynamics in a principled manner. The superior performance is due in part to the fact that it optimizes global state estimates. We demonstrate the efficiency and efficacy of the method on a number of canonical examples, thus demonstrating its viability for the wide range of potential applications stated above.

NAFeb 16, 2016
Nonlinear model order reduction via Dynamic Mode Decomposition

Alessandro Alla, J. Nathan Kutz

We propose a new technique for obtaining reduced order models for nonlinear dynamical systems. Specifically, we advocate the use of the recently developed Dynamic Mode Decomposition (DMD), an equation-free method, to approximate the nonlinear term. DMD is a spatio-temporal matrix decomposition of a data matrix that correlates spatial features while simultaneously associating the activity with periodic temporal behavior. With this decomposition, one can obtain a fully reduced dimensional surrogate model and avoid the evaluation of the nonlinear term in the online stage. This allows for an impressive speed up of the computational cost, and, at the same time, accurate approximations of the problem. We present a suite of numerical tests to illustrate our approach and to show the effectiveness of the method in comparison to existing approaches.

CVJan 27, 2023
Deep Learning Based Object Tracking in Walking Droplet and Granular Intruder Experiments

Erdi Kara, George Zhang, Joseph J. Williams et al. · uw

We present a deep-learning based tracking objects of interest in walking droplet and granular intruder experiments. In a typical walking droplet experiment, a liquid droplet, known as \textit{walker}, propels itself laterally on the free surface of a vibrating bath of the same liquid. This motion is the result of the interaction between the droplets and the surface waves generated by the droplet itself after each successive bounce. A walker can exhibit a highly irregular trajectory over the course of its motion, including rapid acceleration and complex interactions with the other walkers present in the same bath. In analogy with the hydrodynamic experiments, the granular matter experiments consist of a vibrating bath of very small solid particles and a larger solid \textit{intruder}. Like the fluid droplets, the intruder interacts with and travels the domain due to the waves of the bath but tends to move much slower and much less smoothly than the droplets. When multiple intruders are introduced, they also exhibit complex interactions with each other. We leverage the state-of-art object detection model YOLO and the Hungarian Algorithm to accurately extract the trajectory of a walker or intruder in real-time. Our proposed methodology is capable of tracking individual walker(s) or intruder(s) in digital images acquired from a broad spectrum of experimental settings and does not suffer from any identity-switch issues. Thus, the deep learning approach developed in this work could be used to automatize the efficient, fast and accurate extraction of observables of interests in walking droplet and granular flow experiments. Such extraction capabilities are critically enabling for downstream tasks such as building data-driven dynamical models for the coarse-grained dynamics and interactions of the objects of interest.

ROSep 20, 2022
Robust, High-Rate Trajectory Tracking on Insect-Scale Soft-Actuated Aerial Robots with Deep-Learned Tube MPC

Andrea Tagliabue, Yi-Hsuan Hsiao, Urban Fasel et al.

Accurate and agile trajectory tracking in sub-gram Micro Aerial Vehicles (MAVs) is challenging, as the small scale of the robot induces large model uncertainties, demanding robust feedback controllers, while the fast dynamics and computational constraints prevent the deployment of computationally expensive strategies. In this work, we present an approach for agile and computationally efficient trajectory tracking on the MIT SoftFly, a sub-gram MAV (0.7 grams). Our strategy employs a cascaded control scheme, where an adaptive attitude controller is combined with a neural network policy trained to imitate a trajectory tracking robust tube model predictive controller (RTMPC). The neural network policy is obtained using our recent work, which enables the policy to preserve the robustness of RTMPC, but at a fraction of its computational cost. We experimentally evaluate our approach, achieving position Root Mean Square Errors lower than 1.8 cm even in the more challenging maneuvers, obtaining a 60% reduction in maximum position error compared to our previous work, and demonstrating robustness to large external disturbances

PLASM-PHAug 26, 2023
Dynamic Mode Decomposition for data-driven analysis and reduced-order modelling of ExB plasmas: I. Extraction of spatiotemporally coherent patterns

Farbod Faraji, Maryam Reza, Aaron Knoll et al.

In this two-part article, we evaluate the utility and the generalizability of the Dynamic Mode Decomposition (DMD) algorithm for data-driven analysis and reduced-order modelling of plasma dynamics in cross-field ExB configurations. The DMD algorithm is an interpretable data-driven method that finds a best-fit linear model describing the time evolution of spatiotemporally coherent structures (patterns) in data. We have applied the DMD to extensive high-fidelity datasets generated using a particle-in-cell (PIC) code based on a cost-efficient reduced-order PIC scheme. In this part, we first provide an overview of the concept of DMD and its underpinning Proper Orthogonal and Singular Value Decomposition methods. Two of the main DMD variants are next introduced. We then present and discuss the results of the DMD application in terms of the identification and extraction of the dominant spatiotemporal modes from high-fidelity data over a range of simulation conditions. We demonstrate that the DMD variant based on variable projection optimization (OPT-DMD) outperforms the basic DMD method in identification of the modes underlying the data, leading to notably more reliable reconstruction of the ground-truth. Furthermore, we show in multiple test cases that the discrete frequency spectrum of OPT-DMD-extracted modes is consistent with the temporal spectrum from the Fast Fourier Transform of the data. This observation implies that the OPT-DMD augments the conventional spectral analyses by being able to uniquely reveal the spatial structure of the dominant modes in the frequency spectra, thus, yielding more accessible, comprehensive information on the spatiotemporal characteristics of the plasma phenomena.

SYJun 20, 2019
Dynamic mode decomposition for multiscale nonlinear physics

Daniel Dylewsky, Molei Tao, J. Nathan Kutz

We present a data-driven method for separating complex, multiscale systems into their constituent time-scale components using a recursive implementation of dynamic mode decomposition (DMD). Local linear models are built from windowed subsets of the data, and dominant time scales are discovered using spectral clustering on their eigenvalues. This approach produces time series data for each identified component, which sum to a faithful reconstruction of the input signal. It differs from most other methods in the field of multiresolution analysis (MRA) in that it 1) accounts for spatial and temporal coherencies simultaneously, making it more robust to scale overlap between components, and 2) yields a closed-form expression for local dynamics at each scale, which can be used for short-term prediction of any or all components. Our technique is an extension of multi-resolution dynamic mode decomposition (mrDMD), generalized to treat a broader variety of multiscale systems and more faithfully reconstruct their isolated components. In this paper we present an overview of our algorithm and its results on two example physical systems, and briefly discuss some advantages and potential forecasting applications for the technique.

LGNov 6, 2025
Accelerating scientific discovery with the common task framework

J. Nathan Kutz, Peter Battaglia, Michael Brenner et al.

Machine learning (ML) and artificial intelligence (AI) algorithms are transforming and empowering the characterization and control of dynamic systems in the engineering, physical, and biological sciences. These emerging modeling paradigms require comparative metrics to evaluate a diverse set of scientific objectives, including forecasting, state reconstruction, generalization, and control, while also considering limited data scenarios and noisy measurements. We introduce a common task framework (CTF) for science and engineering, which features a growing collection of challenge data sets with a diverse set of practical and common objectives. The CTF is a critically enabling technology that has contributed to the rapid advance of ML/AI algorithms in traditional applications such as speech recognition, language processing, and computer vision. There is a critical need for the objective metrics of a CTF to compare the diverse algorithms being rapidly developed and deployed in practice today across science and engineering.

PLASM-PHAug 26, 2023
Dynamic Mode Decomposition for data-driven analysis and reduced-order modelling of ExB plasmas: II. dynamics forecasting

Farbod Faraji, Maryam Reza, Aaron Knoll et al.

In part I of the article, we demonstrated that a variant of the Dynamic Mode Decomposition (DMD) algorithm based on variable projection optimization, called Optimized DMD (OPT-DMD), enables a robust identification of the dominant spatiotemporally coherent modes underlying the data across various test cases representing different physical parameters in an ExB simulation configuration. As the OPT-DMD can be constrained to produce stable reduced-order models (ROMs) by construction, in this paper, we extend the application of the OPT-DMD and investigate the capabilities of the linear ROM from this algorithm toward forecasting in time of the plasma dynamics in configurations representative of the radial-azimuthal and axial-azimuthal cross-sections of a Hall thruster and over a range of simulation parameters in each test case. The predictive capacity of the OPT-DMD ROM is assessed primarily in terms of short-term dynamics forecast or, in other words, for large ratios of training-to-test data. However, the utility of the ROM for long-term dynamics forecasting is also presented for an example case in the radial-azimuthal configuration. The model's predictive performance is heterogeneous across various test cases. Nonetheless, a remarkable predictiveness is observed in the test cases that do not exhibit highly transient behaviors. Moreover, in all investigated cases, the error between the ground-truth and the reconstructed data from the OPT-DMD ROM remains bounded over time within both the training and the test window. As a result, despite its limitation in terms of generalized applicability to all plasma conditions, the OPT-DMD is proven as a reliable method to develop low computational cost and highly predictive data-driven reduced-order models in systems with a quasi-periodic global evolution of the plasma state.

46.5LGMar 16
OpenReservoirComputing: GPU-Accelerated Reservoir Computing in JAX

Jan Williams, Dima Tretiak, Steven L. Brunton et al.

OpenReservoirComputing (ORC) is a Python library for reservoir computing (RC) written in JAX (Bradbury et al. 2018) and Equinox (Kidger and Garcia 2021). JAX is a Python library for high-performance numerical computing that enables automatic differentiation, just-in-time (JIT) compilation, and GPU/TPU acceleration, while Equinox is a neural network framework for JAX. RC is a form of machine learning that functions by lifting a low-dimensional sequence or signal into a high-dimensional dynamical system and training a simple, linear readout layer from the high-dimensional dynamics back to a lower-dimensional quantity of interest. The most common application of RC is time-series forecasting, where the goal is to predict a signal's future evolution. RC has achieved state-of-the-art performance on this task, particularly when applied to chaotic dynamical systems. In addition, RC approaches can be adapted to perform classification and control tasks. ORC provides both modular components for building custom RC models and built-in models for forecasting, classification, and control. By building on JAX and Equinox, ORC offers GPU acceleration, JIT compilation, and automatic vectorization. These capabilities make prototyping new models faster and enable larger and more powerful reservoir architectures. End-to-end differentiability also enables seamless integration with other deep learning models built with Equinox.

LGJul 20, 2023
Leveraging arbitrary mobile sensor trajectories with shallow recurrent decoder networks for full-state reconstruction

Megan R. Ebers, Jan P. Williams, Katherine M. Steele et al.

Sensing is one of the most fundamental tasks for the monitoring, forecasting and control of complex, spatio-temporal systems. In many applications, a limited number of sensors are mobile and move with the dynamics, with examples including wearable technology, ocean monitoring buoys, and weather balloons. In these dynamic systems (without regions of statistical-independence), the measurement time history encodes a significant amount of information that can be extracted for critical tasks. Most model-free sensing paradigms aim to map current sparse sensor measurements to the high-dimensional state space, ignoring the time-history all together. Using modern deep learning architectures, we show that a sequence-to-vector model, such as an LSTM (long, short-term memory) network, with a decoder network, dynamic trajectory information can be mapped to full state-space estimates. Indeed, we demonstrate that by leveraging mobile sensor trajectories with shallow recurrent decoder networks, we can train the network (i) to accurately reconstruct the full state space using arbitrary dynamical trajectories of the sensors, (ii) the architecture reduces the variance of the mean-square error of the reconstruction error in comparison with immobile sensors, and (iii) the architecture also allows for rapid generalization (parameterization of dynamics) for data outside the training set. Moreover, the path of the sensor can be chosen arbitrarily, provided training data for the spatial trajectory of the sensor is available. The exceptional performance of the network architecture is demonstrated on three applications: turbulent flows, global sea-surface temperature data, and human movement biomechanics.

OCJan 11, 2020
Optimal Sensor and Actuator Selection using Balanced Model Reduction

Krithika Manohar, J. Nathan Kutz, Steven L. Brunton

Optimal sensor and actuator selection is a central challenge in high-dimensional estimation and control. Nearly all subsequent control decisions are affected by these sensor/actuator locations, and optimal placement amounts to an intractable brute-force search among the combinatorial possibilities. In this work, we exploit balanced model reduction and greedy optimization to efficiently determine sensor and actuator selections that optimize observability and controllability. In particular, we determine locations that optimize scalar measures of observability and controllability via greedy matrix QR pivoting on the dominant modes of the direct and adjoint balancing transformations. Pivoting runtime scales linearly with the state dimension, making this method tractable for high-dimensional systems. The results are demonstrated on the linearized Ginzburg-Landau system, for which our algorithm approximates known optimal placements computed using costly gradient descent methods.

NANov 7, 2016
Randomized Model Order Reduction

Alessandro Alla, J. Nathan Kutz

Singular value decomposition (SVD) has a crucial role in model order reduction. It is often utilized in the offline stage to compute basis functions that project the high-dimensional nonlinear problem into a low-dimensionsl model which is, then, evaluated cheaply. It constitutes a building block for many techniques such as e.g. Proper Orthogonal Decomposition and Dynamic Mode Decomposition. The aim of this work is to provide efficient computation of the basis functions via randomized matrix decompositions. This is possible due to the randomized Singular Value Decomposition (rSVD) which is a fast and accurate alternative of the SVD. Although this is considered as offline stage, this computation may be extremely expensive and therefore the use of compressed techniques drastically reduce its cost. Numerical examples show the effectiveness of the method for both Proper Orthogonal Decomposition (POD) and Dynamic Mode Decomposition (DMD).

STAT-MECHJul 21, 2023
Data-Induced Interactions of Sparse Sensors Using Statistical Physics

Andrei A. Klishin, J. Nathan Kutz, Krithika Manohar

Large-dimensional empirical data in science and engineering frequently have a low-rank structure and can be represented as a combination of just a few eigenmodes. Because of this structure, we can use just a few spatially localized sensor measurements to reconstruct the full state of a complex system. The quality of this reconstruction, especially in the presence of sensor noise, depends significantly on the spatial configuration of the sensors. Multiple algorithms based on gappy interpolation and QR factorization have been proposed to optimize sensor placement. Here, instead of an algorithm that outputs a single "optimal" sensor configuration, we take a statistical mechanics view to compute the full landscape of sensor interactions induced by the training data. The two key advances of this paper are the recasting of the sensor placement landscape in an Ising model form and a regularized reconstruction that significantly decreases reconstruction error for few sensors. In addition, we provide first uncertainty quantification of the sparse sensing reconstruction and open questions about the shape of reconstruction risk curve. Mapping out these data-induced sensor interactions allows combining them with external selection criteria and anticipating sensor replacement impacts.

LGNov 12, 2023
Attention for Causal Relationship Discovery from Biological Neural Dynamics

Ziyu Lu, Anika Tabassum, Shruti Kulkarni et al.

This paper explores the potential of the transformer models for learning Granger causality in networks with complex nonlinear dynamics at every node, as in neurobiological and biophysical networks. Our study primarily focuses on a proof-of-concept investigation based on simulated neural dynamics, for which the ground-truth causality is known through the underlying connectivity matrix. For transformer models trained to forecast neuronal population dynamics, we show that the cross attention module effectively captures the causal relationship among neurons, with an accuracy equal or superior to that for the most popular Granger causality analysis method. While we acknowledge that real-world neurobiology data will bring further challenges, including dynamic connectivity and unobserved variability, this research offers an encouraging preliminary glimpse into the utility of the transformer model for causal representation learning in neuroscience.

AISep 21, 2024
AI Assistants for Spaceflight Procedures: Combining Generative Pre-Trained Transformer and Retrieval-Augmented Generation on Knowledge Graphs With Augmented Reality Cues

Oliver Bensch, Leonie Bensch, Tommy Nilsson et al.

This paper describes the capabilities and potential of the intelligent personal assistant (IPA) CORE (Checklist Organizer for Research and Exploration), designed to support astronauts during procedures onboard the International Space Station (ISS), the Lunar Gateway station, and beyond. We reflect on the importance of a reliable and flexible assistant capable of offline operation and highlight the usefulness of audiovisual interaction using augmented reality elements to intuitively display checklist information. We argue that current approaches to the design of IPAs in space operations fall short of meeting these criteria. Therefore, we propose CORE as an assistant that combines Knowledge Graphs (KGs), Retrieval-Augmented Generation (RAG) for a Generative Pre-Trained Transformer (GPT), and Augmented Reality (AR) elements to ensure an intuitive understanding of procedure steps, reliability, offline availability, and flexibility in terms of response style and procedure updates.

95.6SYMar 10
Experimental Modal Analysis for engineering structures via time-delay Dynamic Mode Decomposition with Control

Yanxin Si, Bayu Jayawardhana, J. Nathan Kutz et al.

Experimental Modal Analysis (EMA) has been widely used to identify structural dynamic properties, including natural frequencies, damping ratios, and mode shapes, for structural integrity assessment. The Poly-reference Least Squares Complex Frequency (pLSCF) method is one of the most widely adopted approaches for EMA because of its strong ability to separate closely spaced modes and its robustness to measurement noise. However, pLSCF-based EMA is generally limited to low-dimensional cases with a small number of measurement points, as its computational cost increases rapidly for high-dimensional or continuous structural measurements, particularly with increasing model order. To overcome this limitation, this paper develops a high-dimensional EMA framework based on Dynamic Mode Decomposition with control (DMDc), a powerful data-driven technique originally developed in fluid dynamics, for modal identification under high-dimensional measurement scenarios. Specifically, the relationship between pLSCF and time-delay DMDc is clarified through the discrete state-space representation of the auto-regressive with exogenous inputs (ARX) model for linear systems. By showing that both methods describe the same physical dynamics of the structure, this study provides a physics-based rationale for applying time-delay DMDc to EMA. The capability and advantages of time-delay DMDc for modal parameter identification in both low- and high-dimensional measurements are validated through numerical simulations of a 6-DOF system and experiments on a cantilever beam using a digital camera. The results demonstrate that time-delay DMDc enables robust and reliable modal parameter identification, effectively addressing high-dimensional EMA problems that are difficult for conventional pLSCF and highlighting its potential for real-world structural dynamics applications.

LGJan 28Code
Cheap2Rich: A Multi-Fidelity Framework for Data Assimilation and System Identification of Multiscale Physics -- Rotating Detonation Engines

Yuxuan Bao, Jan Zajac, Megan Powers et al.

Bridging the sim2real gap between computationally inexpensive models and complex physical systems remains a central challenge in machine learning applications to engineering problems, particularly in multi-scale settings where reduced-order models typically capture only dominant dynamics. In this work, we present Cheap2Rich, a multi-scale data assimilation framework that reconstructs high-fidelity state spaces from sparse sensor histories by combining a fast low-fidelity prior with learned, interpretable discrepancy corrections. We demonstrate the performance on rotating detonation engines (RDEs), a challenging class of systems that couple detonation-front propagation with injector-driven unsteadiness, mixing, and stiff chemistry across disparate scales. Our approach successfully reconstructs high-fidelity RDE states from sparse measurements while isolating physically meaningful discrepancy dynamics associated with injector-driven effects. The results highlight a general multi-fidelity framework for data assimilation and system identification in complex multi-scale systems, enabling rapid design exploration and real-time monitoring and control while providing interpretable discrepancy dynamics. Code for this project is is available at: github.com/kro0l1k/Cheap2Rich.

LGSep 18, 2022
Koopman-theoretic Approach for Identification of Exogenous Anomalies in Nonstationary Time-series Data

Alex Mallen, Christoph A. Keller, J. Nathan Kutz

In many scenarios, it is necessary to monitor a complex system via a time-series of observations and determine when anomalous exogenous events have occurred so that relevant actions can be taken. Determining whether current observations are abnormal is challenging. It requires learning an extrapolative probabilistic model of the dynamics from historical data, and using a limited number of current observations to make a classification. We leverage recent advances in long-term probabilistic forecasting, namely {\em Deep Probabilistic Koopman}, to build a general method for classifying anomalies in multi-dimensional time-series data. We also show how to utilize models with domain knowledge of the dynamics to reduce type I and type II error. We demonstrate our proposed method on the important real-world task of global atmospheric pollution monitoring, integrating it with NASA's Global Earth System Model. The system successfully detects localized anomalies in air quality due to events such as COVID-19 lockdowns and wildfires.

86.5COMP-PHMay 22
Multi-Fidelity Learning with Shallow Recurrent Decoders for Reactor Physics

Stefano Riva, Carolina Introini, J. Nathan Kutz et al.

In reactor physics, neutronics can be treated with different fidelity levels, according to the needs of the user. On one hand, the precise modeling of neutrons' behaviour in reactor physics is often expensive and time-consuming due to the high computational costs to numerically solve the Boltzmann transport equation. Conversely, by adopting suitable assumptions, such as the SP$_N$, diffusion theory, and point kinetics, it is possible to generate efficiently low-fidelity data. From the perspective of surrogate models, this computational limitation translates into a scarcity of high-fidelity data and a significant amount of low-fidelity data. Given this difference in fidelity levels, it would be interesting to develop a suitable procedure to map low-fidelity models towards higher fidelity models; for instance, one could obtain the solution to a multi-group diffusion equation starting from time-series data obtained from a point kinetics model. Indeed, this work investigates this possibility by leveraging multi-fidelity information with Shallow Recurrent Decoders, a novel machine learning architecture able to map time-series observations to the full state of the reactor. This technique has been designed to use local or global measurements as input and map their temporal trajectories to the high-dimensional state; by the same logic, in principle, this architecture can also be used when the input is formed by the solution of a lumped model. This work applies this idea to a benchmark reactor geometry, mapping the point kinetics model to the diffusion solution under various input conditions, with much less computational costs.

CEDec 2, 2025
Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms

Philippe Martin Wyder, Judah Goldfeder, Alexey Yermakov et al.

Machine learning (ML) is transforming modeling and control in the physical, engineering, and biological sciences. However, rapid development has outpaced the creation of standardized, objective benchmarks - leading to weak baselines, reporting bias, and inconsistent evaluations across methods. This undermines reproducibility, misguides resource allocation, and obscures scientific progress. To address this, we propose a Common Task Framework (CTF) for scientific machine learning. The CTF features a curated set of datasets and task-specific metrics spanning forecasting, state reconstruction, and generalization under realistic constraints, including noise and limited data. Inspired by the success of CTFs in fields like natural language processing and computer vision, our framework provides a structured, rigorous foundation for head-to-head evaluation of diverse algorithms. As a first step, we benchmark methods on two canonical nonlinear systems: Kuramoto-Sivashinsky and Lorenz. These results illustrate the utility of the CTF in revealing method strengths, limitations, and suitability for specific classes of problems and diverse objectives. Next, we are launching a competition around a global real world sea surface temperature dataset with a true holdout dataset to foster community engagement. Our long-term vision is to replace ad hoc comparisons with standardized evaluations on hidden test sets that raise the bar for rigor and reproducibility in scientific ML.

77.7LGMay 15
CTF4Nuclear: Common Task Framework for Nuclear Fission and Fusion Models

Stefano Riva, Carolina Introini, Antonio Cammi et al.

The demand for clean energy is ever increasing, with new nuclear technologies presenting a complementary solution to renewable energies. However, designing and operating these systems is exceptionally difficult, given the complexity of the physical phenomena that interact to form the system dynamics. While high-fidelity simulations help to understand the non-linear, multi-physics interactions within a reactor, they are computationally expensive and rarely suitable for real-time applications. Furthermore, model-based approaches are inherently sensitive to simplifying assumptions required to derive their governing equations and parameters, leading to inevitable discrepancies with real-world measurements. In contrast, Machine Learning (ML) methods have the potential to generate reliable surrogate models which may be able to quickly predict the system's behaviour. However, the number of data-driven methods that can potentially be used for this task is large and diverse. In a safety-critical setting such as nuclear engineering, a fair comparison of different ML methods, and a clear understanding of their advantages and limitations, is of paramount importance. To address this, we introduce a Common Task Framework (CTF) for ML in nuclear engineering, building upon previous efforts in dynamical systems and seismology. This CTF considers a curated set of datasets from different nuclear and nuclear-adjacent systems. The CTF evaluates the performance of a method on 12 established metrics, alongside a new paradigm focused on system monitoring from sparse measurements only. We illustrate the framework by benchmarking standard ML baselines against these datasets, revealing current method limitations. Our vision is to replace ad hoc comparisons with standardized evaluations on hidden test sets, raising the bar for rigour and reproducibility in scientific ML for the nuclear industry.

LGDec 22, 2025
The Seismic Wavefield Common Task Framework

Alexey Yermakov, Yue Zhao, Marine Denolle et al.

Seismology faces fundamental challenges in state forecasting and reconstruction (e.g., earthquake early warning and ground motion prediction) and managing the parametric variability of source locations, mechanisms, and Earth models (e.g., subsurface structure and topography effects). Addressing these with simulations is hindered by their massive scale, both in synthetic data volumes and numerical complexity, while real-data efforts are constrained by models that inadequately reflect the Earth's complexity and by sparse sensor measurements from the field. Recent machine learning (ML) efforts offer promise, but progress is obscured by a lack of proper characterization, fair reporting, and rigorous comparisons. To address this, we introduce a Common Task Framework (CTF) for ML for seismic wavefields, starting with three distinct wavefield datasets. Our CTF features a curated set of datasets at various scales (global, crustal, and local) and task-specific metrics spanning forecasting, reconstruction, and generalization under realistic constraints such as noise and limited data. Inspired by CTFs in fields like natural language processing, this framework provides a structured and rigorous foundation for head-to-head algorithm evaluation. We illustrate the evaluation procedure with scores reported for two of the datasets, showcasing the performance of various methods and foundation models for reconstructing seismic wavefields from both simulated and real-world sensor measurements. The CTF scores reveal the strengths, limitations, and suitability for specific problem classes. Our vision is to replace ad hoc comparisons with standardized evaluations on hidden test sets, raising the bar for rigor and reproducibility in scientific ML.

QMFeb 22
An Interpretable Data-Driven Model of the Flight Dynamics of Hawks

Lydia France, Karl Lapo, J. Nathan Kutz

Despite significant analysis of bird flight, generative physics models for flight dynamics do not currently exist. Yet the underlying mechanisms responsible for various flight manoeuvres are important for understanding how agile flight can be accomplished. Even in a simple flight, multiple objectives are at play, complicating analysis of the overall flight mechanism. Using the data-driven method of dynamic mode decomposition (DMD) on motion capture recordings of hawks, we show that multiple behavioral states such as flapping, turning, landing, and gliding, can be modeled by simple and interpretable modal structures (i.e. the underlying wing-tail shape) which can be linearly combined to reproduce the experimental flight observations. Moreover, the DMD model can be used to extrapolate naturalistic flapping. Flight is highly individual, with differences in style across the hawks, but we find they share a common set of dynamic modes. The DMD model is a direct fit to data, unlike traditional models constructed from physics principles which can rarely be tested on real data and whose assumptions are typically invalid in real flight. The DMD approach gives a highly accurate reconstruction of the flight dynamics with only three parameters needed to characterize flapping, and a fourth to integrate turning manoeuvres. The DMD analysis further shows that the underlying mechanism of flight, much like simplest walking models, displays a parametric coupling between dominant modes suggesting efficiency for locomotion.

52.8LGApr 1
UQ-SHRED: uncertainty quantification of shallow recurrent decoder networks for sparse sensing via engression

Mars Liyao Gao, Yuxuan Bao, Amy S. Rude et al.

Reconstructing high-dimensional spatiotemporal fields from sparse sensor measurements is critical in a wide range of scientific applications. The SHallow REcurrent Decoder (SHRED) architecture is a recent state-of-the-art architecture that reconstructs high-quality spatial domain from hyper-sparse sensor measurement streams. An important limitation of SHRED is that in complex, data-scarce, high-frequency, or stochastic systems, portions of the spatiotemporal field must be modeled with valid uncertainty estimation. We introduce UQ-SHRED, a distributional learning framework for sparse sensing problems that provides uncertainty quantification through a neural network-based distributional regression called engression. UQ-SHRED models the uncertainty by learning the predictive distribution of the spatial state conditioned on the sensor history. By injecting stochastic noise into sensor inputs and training with an energy score loss, UQ-SHRED produces predictive distributions with minimal computational overhead, requiring only noise injection at the input and resampling through a single architecture without retraining or additional network structures. On complicated synthetic and real-life datasets including turbulent flow, atmospheric dynamics, neuroscience and astrophysics, UQ-SHRED provides a distributional approximation with well-calibrated confidence intervals. We further conduct ablation studies to understand how each model setting affects the quality of the UQ-SHRED performance, and its validity on uncertainty quantification over a set of different experimental setups.

LGJan 29
SENDAI: A Hierarchical Sparse-measurement, EfficieNt Data AssImilation Framework

Xingyue Zhang, Yuxuan Bao, Mars Liyao Gao et al.

Bridging the gap between data-rich training regimes and observation-sparse deployment conditions remains a central challenge in spatiotemporal field reconstruction, particularly when target domains exhibit distributional shifts, heterogeneous structure, and multi-scale dynamics absent from available training data. We present SENDAI, a hierarchical Sparse-measurement, EfficieNt Data AssImilation Framework that reconstructs full spatial states from hyper sparse sensor observations by combining simulation-derived priors with learned discrepancy corrections. We demonstrate the performance on satellite remote sensing, reconstructing MODIS (Moderate Resolution Imaging Spectroradiometer) derived vegetation index fields across six globally distributed sites. Using seasonal periods as a proxy for domain shift, the framework consistently outperforms established baselines that require substantially denser observations -- SENDAI achieves a maximum SSIM improvement of 185% over traditional baselines and a 36% improvement over recent high-frequency-based methods. These gains are particularly pronounced for landscapes with sharp boundaries and sub-seasonal dynamics; more importantly, the framework effectively preserves diagnostically relevant structures -- such as field topologies, land cover discontinuities, and spatial gradients. By yielding corrections that are more structurally and spectrally separable, the reconstructed fields are better suited for downstream inference of indirectly observed variables. The results therefore highlight a lightweight and operationally viable framework for sparse-measurement reconstruction that is applicable to physically grounded inference, resource-limited deployment, and real-time monitor and control.

LGJul 28, 2025Code
PySHRED: A Python package for SHallow REcurrent Decoding for sparse sensing, model reduction and scientific discovery

David Ye, Jan Williams, Mars Gao et al. · uw

SHallow REcurrent Decoders (SHRED) provide a deep learning strategy for modeling high-dimensional dynamical systems and/or spatiotemporal data from dynamical system snapshot observations. PySHRED is a Python package that implements SHRED and several of its major extensions, including for robust sensing, reduced order modeling and physics discovery. In this paper, we introduce the version 1.0 release of PySHRED, which includes data preprocessors and a number of cutting-edge SHRED methods specifically designed to handle real-world data that may be noisy, multi-scale, parameterized, prohibitively high-dimensional, and strongly nonlinear. The package is easy to install, thoroughly-documented, supplemented with extensive code examples, and modularly-structured to support future additions. The entire codebase is released under the MIT license and is available at https://github.com/pyshred-dev/pyshred.

SPFeb 20, 2021Code
PySensors: A Python Package for Sparse Sensor Placement

Brian M. de Silva, Krithika Manohar, Emily Clark et al.

PySensors is a Python package for selecting and placing a sparse set of sensors for classification and reconstruction tasks. Specifically, PySensors implements algorithms for data-driven sparse sensor placement optimization for reconstruction (SSPOR) and sparse sensor placement optimization for classification (SSPOC). In this work we provide a brief description of the mathematical algorithms and theory for sparse sensor optimization, along with an overview and demonstration of the features implemented in PySensors (with code examples). We also include practical advice for user and a list of potential extensions to PySensors. Software is available at https://github.com/dynamicslab/pysensors.

ITDec 17, 2025
Information theory and discriminative sampling for model discovery

Yuxuan Bao, J. Nathan Kutz

Fisher information and Shannon entropy are fundamental tools for understanding and analyzing dynamical systems from complementary perspectives. They can characterize unknown parameters by quantifying the information contained in variables, or measure how different initial trajectories or temporal segments of a trajectory contribute to learning or inferring system dynamics. In this work, we leverage the Fisher Information Matrix (FIM) within the data-driven framework of {\em sparse identification of nonlinear dynamics} (SINDy). We visualize information patterns in chaotic and non-chaotic systems for both single trajectories and multiple initial conditions, demonstrating how information-based analysis can improve sampling efficiency and enhance model performance by prioritizing more informative data. The benefits of statistical bagging are further elucidated through spectral analysis of the FIM. We also illustrate how Fisher information and entropy metrics can promote data efficiency in three scenarios: when only a single trajectory is available, when a tunable control parameter exists, and when multiple trajectories can be freely initialized. As data-driven model discovery continues to gain prominence, principled sampling strategies guided by quantifiable information metrics offer a powerful approach for improving learning efficiency and reducing data requirements.

LGDec 1, 2025
Data assimilation and discrepancy modeling with shallow recurrent decoders

Yuxuan Bao, J. Nathan Kutz

The requirements of modern sensing are rapidly evolving, driven by increasing demands for data efficiency, real-time processing, and deployment under limited sensing coverage. Complex physical systems are often characterized through the integration of a limited number of point sensors in combination with scientific computations which approximate the dominant, full-state dynamics. Simulation models, however, inevitably neglect small-scale or hidden processes, are sensitive to perturbations, or oversimplify parameter correlations, leading to reconstructions that often diverge from the reality measured by sensors. This creates a critical need for data assimilation, the process of integrating observational data with predictive simulation models to produce coherent and accurate estimates of the full state of complex physical systems. We propose a machine learning framework for Data Assimilation with a SHallow REcurrent Decoder (DA-SHRED) which bridges the simulation-to-real (SIM2REAL) gap between computational modeling and experimental sensor data. For real-world physics systems modeling high-dimensional spatiotemporal fields, where the full state cannot be directly observed and must be inferred from sparse sensor measurements, we leverage the latent space learned from a reduced simulation model via SHRED, and update these latent variables using real sensor data to accurately reconstruct the full system state. Furthermore, our algorithm incorporates a sparse identification of nonlinear dynamics based regression model in the latent space to identify functionals corresponding to missing dynamics in the simulation model. We demonstrate that DA-SHRED successfully closes the SIM2REAL gap and additionally recovers missing dynamics in highly complex systems, demonstrating that the combination of efficient temporal encoding and physics-informed correction enables robust data assimilation.

PLASM-PHMay 20, 2024
Shallow Recurrent Decoder for Reduced Order Modeling of Plasma Dynamics

J. Nathan Kutz, Maryam Reza, Farbod Faraji et al.

Reduced order models are becoming increasingly important for rendering complex and multiscale spatio-temporal dynamics computationally tractable. The computational efficiency of such surrogate models is especially important for design, exhaustive exploration and physical understanding. Plasma simulations, in particular those applied to the study of ${\bf E}\times {\bf B}$ plasma discharges and technologies, such as Hall thrusters, require substantial computational resources in order to resolve the multidimentional dynamics that span across wide spatial and temporal scales. Although high-fidelity computational tools are available to simulate such systems over limited conditions and in highly simplified geometries, simulations of full-size systems and/or extensive parametric studies over many geometric configurations and under different physical conditions are computationally intractable with conventional numerical tools. Thus, scientific studies and industrially oriented modeling of plasma systems, including the important ${\bf E}\times {\bf B}$ technologies, stand to significantly benefit from reduced order modeling algorithms. We develop a model reduction scheme based upon a {\em Shallow REcurrent Decoder} (SHRED) architecture. The scheme uses a neural network for encoding limited sensor measurements in time (sequence-to-sequence encoding) to full state-space reconstructions via a decoder network. Based upon the theory of separation of variables, the SHRED architecture is capable of (i) reconstructing full spatio-temporal fields with as little as three point sensors, even the fields that are not measured with sensor feeds but that are in dynamic coupling with the measured field, and (ii) forecasting the future state of the system using neural network roll-outs from the trained time encoding model.

LGJan 23, 2025
Sparse identification of nonlinear dynamics and Koopman operators with Shallow Recurrent Decoder Networks

Mars Liyao Gao, Jan P. Williams, J. Nathan Kutz · uw

Modeling real-world spatio-temporal data is exceptionally difficult due to inherent high dimensionality, measurement noise, partial observations, and often expensive data collection procedures. In this paper, we present Sparse Identification of Nonlinear Dynamics with SHallow REcurrent Decoder networks (SINDy-SHRED), a method to jointly solve the sensing and model identification problems with simple implementation, efficient computation, and robust performance. SINDy-SHRED uses Gated Recurrent Units to model the temporal sequence of sparse sensor measurements along with a shallow decoder network to reconstruct the full spatio-temporal field from the latent state space. Our algorithm introduces a SINDy-based regularization for which the latent space progressively converges to a SINDy-class functional, provided the projection remains within the set. In restricting SINDy to a linear model, a Koopman-SHRED model is generated. SINDy-SHRED (i) learns a symbolic and interpretable generative model of a parsimonious and low-dimensional latent space for the complex spatio-temporal dynamics, (ii) discovers new physics models even for well-known physical systems, (iii) achieves provably robust convergence with an observed globally convex loss landscape, and (iv) achieves superior accuracy, data efficiency, and training time, all with fewer model parameters. We conduct systematic experimental studies on PDE data such as turbulent flows, real-world sensor measurements for sea surface temperature, and direct video data. The interpretable SINDy and Koopman models of latent state dynamics enable stable and accurate long-term video predictions, outperforming all current baseline deep learning models in accuracy, training time, and data requirements, including Convolutional LSTM, PredRNN, ResNet, and SimVP.

PLASM-PHMar 3, 2024
Data-driven local operator finding for reduced-order modelling of plasma systems: II. Application to parametric dynamics

Farbod Faraji, Maryam Reza, Aaron Knoll et al.

Real-world systems often exhibit dynamics influenced by various parameters, either inherent or externally controllable, necessitating models capable of reliably capturing these parametric behaviors. Plasma technologies exemplify such systems. For example, phenomena governing global dynamics in Hall thrusters (a spacecraft propulsion technology) vary with various parameters, such as the "self-sustained electric field". In this Part II, following on the introduction of our novel data-driven local operator finding algorithm, Phi Method, in Part I, we showcase the method's effectiveness in learning parametric dynamics to predict system behavior across unseen parameter spaces. We present two adaptations: the "parametric Phi Method" and the "ensemble Phi Method", which are demonstrated through 2D fluid-flow-past-a-cylinder and 1D Hall-thruster-plasma-discharge problems. Comparative evaluation against parametric OPT-DMD in the fluid case demonstrates superior predictive performance of the parametric Phi Method. Across both test cases, parametric and ensemble Phi Method reliably recover governing parametric PDEs and offer accurate predictions over test parameters. Ensemble ROM analysis underscores Phi Method's robust learning of dominant dynamic coefficients with high confidence.

LGFeb 15, 2025
Reduced Order Modeling with Shallow Recurrent Decoder Networks

Matteo Tomasetto, Jan P. Williams, Francesco Braghin et al.

Reduced Order Modeling is of paramount importance for efficiently inferring high-dimensional spatio-temporal fields in parametric contexts, enabling computationally tractable parametric analyses, uncertainty quantification and control. However, conventional dimensionality reduction techniques are typically limited to known and constant parameters, inefficient for nonlinear and chaotic dynamics, and uninformed to the actual system behavior. In this work, we propose sensor-driven SHallow REcurrent Decoder networks for Reduced Order Modeling (SHRED-ROM). Specifically, we consider the composition of a long short-term memory network, which encodes the temporal dynamics of limited sensor data in multiple scenarios, and a shallow decoder, which reconstructs the corresponding high-dimensional states. SHRED-ROM is a robust decoding-only strategy that circumvents the numerically unstable approximation of an inverse which is required by encoding-decoding schemes. To enhance computational efficiency and memory usage, the full-order state snapshots are reduced by, e.g., proper orthogonal decomposition, allowing for compressive training of the networks with minimal hyperparameter tuning. Through applications on chaotic and nonlinear fluid dynamics, we show that SHRED-ROM (i) accurately reconstructs the state dynamics for new parameter values starting from limited fixed or mobile sensors, independently on sensor placement, (ii) can cope with both physical, geometrical and time-dependent parametric dependencies, while being agnostic to their actual values, (iii) can accurately estimate unknown parameters, and (iv) can deal with different data sources, such as high-fidelity simulations, coupled fields and videos.

PLASM-PHMar 3, 2024
Data-driven local operator finding for reduced-order modelling of plasma systems: I. Concept and verifications

Farbod Faraji, Maryam Reza, Aaron Knoll et al.

Reduced-order plasma models that can efficiently predict plasma behavior across various settings and configurations are highly sought after yet elusive. The demand for such models has surged in the past decade due to their potential to facilitate scientific research and expedite the development of plasma technologies. In line with the advancements in computational power and data-driven methods, we introduce the "Phi Method" in this two-part article. Part I presents this novel algorithm, which employs constrained regression on a candidate term library informed by numerical discretization schemes to discover discretized systems of differential equations. We demonstrate Phi Method's efficacy in deriving reliable and robust reduced-order models (ROMs) for three test cases: the Lorenz attractor, flow past a cylinder, and a 1D Hall-thruster-representative plasma. Part II will delve into the method's application for parametric dynamics discovery. Our results show that ROMs derived from the Phi Method provide remarkably accurate predictions of systems' behavior, whether derived from steady-state or transient-state data. This underscores the method's potential for transforming plasma system modeling.

STAT-MECHMar 4, 2024
Statistical Mechanics of Dynamical System Identification

Andrei A. Klishin, Joseph Bakarji, J. Nathan Kutz et al.

Recovering dynamical equations from observed noisy data is the central challenge of system identification. We develop a statistical mechanics approach to analyze sparse equation discovery algorithms, which typically balance data fit and parsimony via hyperparameter tuning. In this framework, statistical mechanics offers tools to analyze the interplay between complexity and fitness similarly to that of entropy and energy in physical systems. To establish this analogy, we define the hyperparameter optimization procedure as a two-level Bayesian inference problem that separates variable selection from coefficient inference and enables the computation of the posterior parameter distribution in closed form. Our approach provides uncertainty quantification, crucial in the low-data limit that is frequently encountered in real-world applications. A key advantage of employing statistical mechanical concepts, such as free energy and the partition function, is to connect the large data limit to thermodynamic limit and characterize the sparsity- and noise-induced phase transitions that delineate correct from incorrect identification. We thus provide a method for closed-loop inference, estimating the noise in a given model and checking if the model is tolerant to that noise amount. This perspective of sparse equation discovery is versatile and can be adapted to various other equation discovery algorithms.

LGMar 11, 2025
From Models To Experiments: Shallow Recurrent Decoder Networks on the DYNASTY Experimental Facility

Carolina Introini, Stefano Riva, J. Nathan Kutz et al.

The Shallow Recurrent Decoder networks are a novel paradigm recently introduced for state estimation, combining sparse observations with high-dimensional model data. This architecture features important advantages compared to standard data-driven methods including: the ability to use only three sensors (even randomly selected) for reconstructing the entire dynamics of a physical system; the ability to train on compressed data spanned by a reduced basis; the ability to measure a single field variable (easy to measure) and reconstruct coupled spatio-temporal fields that are not observable and minimal hyper-parameter tuning. This approach has been verified on different test cases within different fields including nuclear reactors, even though an application to a real experimental facility, adopting the employment of in-situ observed quantities, is missing. This work aims to fill this gap by applying the Shallow Recurrent Decoder architecture to the DYNASTY facility, built at Politecnico di Milano, which studies the natural circulation established by internally heated fluids for Generation IV applications, especially in the case of Circulating Fuel reactors. The RELAP5 code is used to generate the high-fidelity data, and temperature measurements extracted by the facility are used as input for the state estimation. The results of this work will provide a validation of the Shallow Recurrent Decoder architecture to engineering systems, showing the capabilities of this approach to provide and accurate state estimation.