FLU-DYNMay 26
Sparse POD Mode Selection and Manifold Dimensionality Reduction with Neural NetworksTomoki Koike, Prakash Mohan, Marc T. Henry de Frahan et al.
High-performance computing enables simulation of high-dimensional physical systems, but downstream analyses such as inverse problems and control remain computationally expensive, motivating model order reduction (MOR) to construct efficient low-dimensional surrogates. Proper Orthogonal Decomposition (POD), a widely adopted data-driven MOR method, projects dynamics onto linear subspaces spanned by the most energetic modes. However, POD struggles for problems with slowly decaying Kolmogorov \(n\)-widths, such as advection-dominated and turbulent flows, requiring many modes for accurate reconstruction. Moreover, energy-based selection can discard crucial low-energy modes needed to capture small-scale features. Recent nonlinear manifold methods using polynomial mappings with alternating or greedy mode selection achieve better reconstruction with fewer modes. However, these methods fix the nonlinear mapping form a priori, limiting expressivity. Conversely, neural network (NN) manifolds offer greater expressivity but employ energy-based selection. We present SparseModesNet, a dimensionality reduction framework that employs linear encoding via POD modes and nonlinear NN decoding. The decoder leverages LassoNet, a method enforcing hierarchical sparsity through residual connections with linear skip layers, to simultaneously select informative POD modes and learn a nonlinear mapping that minimizes reconstruction error. On benchmark advection-dominated and chaotic flows, SparseModesNet matches or exceeds state-of-the-art performance. For turbulent channel flow at friction Reynolds number \(Re_τ=5200\), we reduce reconstruction error by 51--78\% compared to existing polynomial manifold methods while maintaining interpretability through physically meaningful mode selection.
AO-PHNov 30, 2022
Statistical treatment of convolutional neural network super-resolution of inland surface wind for subgrid-scale variability quantificationDaniel Getter, Julie Bessac, Johann Rudi et al.
Machine learning models have been employed to perform either physics-free data-driven or hybrid dynamical downscaling of climate data. Most of these implementations operate over relatively small downscaling factors because of the challenge of recovering fine-scale information from coarse data. This limits their compatibility with many global climate model outputs, often available between $\sim$50--100 km resolution, to scales of interest such as cloud resolving or urban scales. This study systematically examines the capability of convolutional neural networks (CNNs) to downscale surface wind speed data over land surface from different coarse resolutions (25 km, 48 km, and 100 km resolution) to 3 km. For each downscaling factor, we consider three CNN configurations that generate super-resolved predictions of fine-scale wind speed, which take between 1 to 3 input fields: coarse wind speed, fine-scale topography, and diurnal cycle. In addition to fine-scale wind speeds, probability density function parameters are generated, through which sample wind speeds can be generated accounting for the intrinsic stochasticity of wind speed. For generalizability assessment, CNN models are tested on regions with different topography and climate that are unseen during training. The evaluation of super-resolved predictions focuses on subgrid-scale variability and the recovery of extremes. Models with coarse wind and fine topography as inputs exhibit the best performance compared with other model configurations, operating across the same downscaling factor. Our diurnal cycle encoding results in lower out-of-sample generalizability compared with other input configurations.
MLFeb 26
Uncovering Physical Drivers of Dark Matter Halo Structures with Auxiliary-Variable-Guided Generative ModelsArkaprabha Ganguli, Anirban Samaddar, Florian Kéruzoré et al.
Deep generative models (DGMs) compress high-dimensional data but often entangle distinct physical factors in their latent spaces. We present an auxiliary-variable-guided framework for disentangling representations of thermal Sunyaev-Zel'dovich (tSZ) maps of dark matter halos. We introduce halo mass and concentration as auxiliary variables and apply a lightweight alignment penalty to encourage latent dimensions to reflect these physical quantities. To generate sharp and realistic samples, we extend latent conditional flow matching (LCFM), a state-of-the-art generative model, to enforce disentanglement in the latent space. Our Disentangled Latent-CFM (DL-CFM) model recovers the established mass-concentration scaling relation and identifies latent space outliers that may correspond to unusual halo formation histories. By linking latent coordinates to interpretable astrophysical properties, our method transforms the latent space into a diagnostic tool for cosmological structure. This work demonstrates that auxiliary guidance preserves generative flexibility while yielding physically meaningful, disentangled embeddings, providing a generalizable pathway for uncovering independent factors in complex astronomical datasets.
MLSep 26, 2025
Multi-modal Bayesian Neural Network Surrogates with Conjugate Last-Layer EstimationIan Taylor, Juliane Mueller, Julie Bessac
As data collection and simulation capabilities advance, multi-modal learning, the task of learning from multiple modalities and sources of data, is becoming an increasingly important area of research. Surrogate models that learn from data of multiple auxiliary modalities to support the modeling of a highly expensive quantity of interest have the potential to aid outer loop applications such as optimization, inverse problems, or sensitivity analyses when multi-modal data are available. We develop two multi-modal Bayesian neural network surrogate models and leverage conditionally conjugate distributions in the last layer to estimate model parameters using stochastic variational inference (SVI). We provide a method to perform this conjugate SVI estimation in the presence of partially missing observations. We demonstrate improved prediction accuracy and uncertainty quantification compared to uni-modal surrogate models for both scalar and time series data.
LGAug 21, 2025
Multidimensional Distributional Neural Network Output Demonstrated in Super-Resolution of Surface Wind SpeedHarrison J. Goldwyn, Mitchell Krock, Johann Rudi et al.
Accurate quantification of uncertainty in neural network predictions remains a central challenge for scientific applications involving high-dimensional, correlated data. While existing methods capture either aleatoric or epistemic uncertainty, few offer closed-form, multidimensional distributions that preserve spatial correlation while remaining computationally tractable. In this work, we present a framework for training neural networks with a multidimensional Gaussian loss, generating closed-form predictive distributions over outputs with non-identically distributed and heteroscedastic structure. Our approach captures aleatoric uncertainty by iteratively estimating the means and covariance matrices, and is demonstrated on a super-resolution example. We leverage a Fourier representation of the covariance matrix to stabilize network training and preserve spatial correlation. We introduce a novel regularization strategy -- referred to as information sharing -- that interpolates between image-specific and global covariance estimates, enabling convergence of the super-resolution downscaling network trained on image-specific distributional loss functions. This framework allows for efficient sampling, explicit correlation modeling, and extensions to more complex distribution families all without disrupting prediction performance. We demonstrate the method on a surface wind speed downscaling task and discuss its broader applicability to uncertainty-aware prediction in scientific models.
DSJul 28, 2025
Operator Inference Aware Quadratic Manifolds with Isotropic Reduced Coordinates for Nonintrusive Model ReductionPaul Schwerdtner, Prakash Mohan, Julie Bessac et al.
Quadratic manifolds for nonintrusive reduced modeling are typically trained to minimize the reconstruction error on snapshot data, which means that the error of models fitted to the embedded data in downstream learning steps is ignored. In contrast, we propose a greedy training procedure that takes into account both the reconstruction error on the snapshot data and the prediction error of reduced models fitted to the data. Because our procedure learns quadratic manifolds with the objective of achieving accurate reduced models, it avoids oscillatory and other non-smooth embeddings that can hinder learning accurate reduced models. Numerical experiments on transport and turbulent flow problems show that quadratic manifolds trained with the proposed greedy approach lead to reduced models with up to two orders of magnitude higher accuracy than quadratic manifolds trained with respect to the reconstruction error alone.
MLJun 30, 2025
Enhancing Interpretability in Generative Modeling: Statistically Disentangled Latent Spaces Guided by Generative Factors in Scientific DatasetsArkaprabha Ganguli, Nesar Ramachandra, Julie Bessac et al.
This study addresses the challenge of statistically extracting generative factors from complex, high-dimensional datasets in unsupervised or semi-supervised settings. We investigate encoder-decoder-based generative models for nonlinear dimensionality reduction, focusing on disentangling low-dimensional latent variables corresponding to independent physical factors. Introducing Aux-VAE, a novel architecture within the classical Variational Autoencoder framework, we achieve disentanglement with minimal modifications to the standard VAE loss function by leveraging prior statistical knowledge through auxiliary variables. These variables guide the shaping of the latent space by aligning latent factors with learned auxiliary variables. We validate the efficacy of Aux-VAE through comparative assessments on multiple datasets, including astronomical simulations.
MEJul 29, 2021
Neural Networks for Parameter Estimation in Intractable ModelsAmanda Lenzi, Julie Bessac, Johann Rudi et al.
We propose to use deep learning to estimate parameters in statistical models when standard likelihood estimation methods are computationally infeasible. We show how to estimate parameters from max-stable processes, where inference is exceptionally challenging even with small datasets but simulation is straightforward. We use data from model simulations as input and train deep neural networks to learn statistical parameters. Our neural-network-based method provides a competitive alternative to current approaches, as demonstrated by considerable accuracy and computational time improvements. It serves as a proof of concept for deep learning in statistical parameter estimation and can be extended to other estimation problems.
MLDec 12, 2020
Parameter Estimation with Dense and Convolutional Neural Networks Applied to the FitzHugh-Nagumo ODEJohann Rudi, Julie Bessac, Amanda Lenzi
Machine learning algorithms have been successfully used to approximate nonlinear maps under weak assumptions on the structure and properties of the maps. We present deep neural networks using dense and convolutional layers to solve an inverse problem, where we seek to estimate parameters of a FitzHugh-Nagumo model, which consists of a nonlinear system of ordinary differential equations (ODEs). We employ the neural networks to approximate reconstruction maps for model parameter estimation from observational data, where the data comes from the solution of the ODE and takes the form of a time series representing dynamically spiking membrane potential of a biological neuron. We target this dynamical model because of the computational challenges it poses in an inference setting, namely, having a highly nonlinear and nonconvex data misfit term and permitting only weakly informative priors on parameters. These challenges cause traditional optimization to fail and alternative algorithms to exhibit large computational costs. We quantify the prediction errors of model parameters obtained from the neural networks and investigate the effects of network architectures with and without the presence of noise in observational data. We generalize our framework for neural network-based reconstruction maps to simultaneously estimate ODE parameters and parameters of autocorrelated observational noise. Our results demonstrate that deep neural networks have the potential to estimate parameters in dynamical models and stochastic processes, and they are capable of predicting parameters accurately for the FitzHugh-Nagumo model.