CEJan 31, 2025
Physically Interpretable Representation and Controlled Generation for Turbulence DataTiffany Fan, Murray Cutforth, Marta D'Elia et al.
Computational Fluid Dynamics (CFD) plays a pivotal role in fluid mechanics, enabling precise simulations of fluid behavior through partial differential equations (PDEs). However, traditional CFD methods are resource-intensive, particularly for high-fidelity simulations of complex flows, which are further complicated by high dimensionality, inherent stochasticity, and limited data availability. This paper addresses these challenges by proposing a data-driven approach that leverages a Gaussian Mixture Variational Autoencoder (GMVAE) to encode high-dimensional scientific data into low-dimensional, physically meaningful representations. The GMVAE learns a structured latent space where data can be categorized based on physical properties such as the Reynolds number while maintaining global physical consistency. To assess the interpretability of the learned representations, we introduce a novel metric based on graph spectral theory, quantifying the smoothness of physical quantities along the latent manifold. We validate our approach using 2D Navier-Stokes simulations of flow past a cylinder over a range of Reynolds numbers. Our results demonstrate that the GMVAE provides improved clustering, meaningful latent structure, and robust generative capabilities compared to baseline dimensionality reduction methods. This framework offers a promising direction for data-driven turbulence modeling and broader applications in computational fluid dynamics and engineering systems.
LGMar 8
Generative prediction of laser-induced rocket ignition with dynamic latent space representationsTony Zahtila, Ettore Saetta, Murray Cutforth et al.
Accurate and predictive scale-resolving simulations of laser-ignited rocket engines are highly time-consuming because the problem includes turbulent fuel-oxidizer mixing dynamics, laser-induced energy deposition, and high-speed flame growth. This is conflated with the large design space primarily corresponding to the laser operating conditions and target location. To enable rapid exploration and uncertainty quantification, we propose a data-driven surrogate modeling approach that combines convolutional autoencoders (cAEs) with neural ordinary differential equations (neural ODEs). The present target application of an ML-based surrogate model to leading-edge multi-physics turbulence simulation is part of a paradigm shift in the deployment of surrogate models towards increasing real-world complexity. Sequentially, the cAE spatially compresses high-dimensional flow fields into a low-dimensional latent space, wherein the system's temporal dynamics are learned via neural ODEs. Once trained, the model generates fast spatiotemporal predictions from initial conditions and specified operating inputs. By learning a surrogate to replace the entirety of the time-evolving simulation, the cost of predicting an ignition trial is reduced by several orders of magnitude, allowing efficient exploration of the input parameter space. Further, as the current framework yields a spatiotemporal field prediction, appraisal of the model output's physical grounding is more tractable. This approach marks a significant step toward real-time digital twins for laser-ignited rocket combustors and represents surrogate modeling in a complex system context.
LGNov 26, 2025
Physically Interpretable Representation Learning with Gaussian Mixture Variational AutoEncoder (GM-VAE)Tiffany Fan, Murray Cutforth, Marta D'Elia et al.
Extracting compact, physically interpretable representations from high-dimensional scientific data is a persistent challenge due to the complex, nonlinear structures inherent in physical systems. We propose a Gaussian Mixture Variational Autoencoder (GM-VAE) framework designed to address this by integrating an Expectation-Maximization (EM)-inspired training scheme with a novel spectral interpretability metric. Unlike conventional VAEs that jointly optimize reconstruction and clustering (often leading to training instability), our method utilizes a block-coordinate descent strategy, alternating between expectation and maximization steps. This approach stabilizes training and naturally aligns latent clusters with distinct physical regimes. To objectively evaluate the learned representations, we introduce a quantitative metric based on graph-Laplacian smoothness, which measures the coherence of physical quantities across the latent manifold. We demonstrate the efficacy of this framework on datasets of increasing complexity: surface reaction ODEs, Navier-Stokes wake flows, and experimental laser-induced combustion Schlieren images. The results show that our GM-VAE yields smooth, physically consistent manifolds and accurate regime clustering, offering a robust data-driven tool for interpreting turbulent and reactive flow systems.
LGOct 9, 2025
Multi-fidelity Batch Active Learning for Gaussian Process ClassifiersMurray Cutforth, Yiming Yang, Tiffany Fan et al.
Many science and engineering problems rely on expensive computational simulations, where a multi-fidelity approach can accelerate the exploration of a parameter space. We study efficient allocation of a simulation budget using a Gaussian Process (GP) model in the binary simulation output case. This paper introduces Bernoulli Parameter Mutual Information (BPMI), a batch active learning algorithm for multi-fidelity GP classifiers. BPMI circumvents the intractability of calculating mutual information in the probability space by employing a first-order Taylor expansion of the link function. We evaluate BPMI against several baselines on two synthetic test cases and a complex, real-world application involving the simulation of a laser-ignited rocket combustor. In all experiments, BPMI demonstrates superior performance, achieving higher predictive accuracy for a fixed computational budget.
CEAug 6, 2025
Convolutional autoencoders for the reconstruction of three-dimensional interfacial multiphase flowsMurray Cutforth, Shahab Mirjalili
In this work, we perform a comprehensive investigation of autoencoders for reduced-order modeling of three-dimensional multiphase flows. Focusing on the accuracy of reconstructing multiphase flow volume/mass fractions with a standard convolutional architecture, we examine the advantages and disadvantages of different interface representation choices (diffuse, sharp, level set). We use a combination of synthetic data with non-trivial interface topologies and high-resolution simulation data of multiphase homogeneous isotropic turbulence for training and validation. This study clarifies the best practices for reducing the dimensionality of multiphase flows via autoencoders. Consequently, this paves the path for uncoupling the training of autoencoders for accurate reconstruction and the training of temporal or input/output models such as neural operators (e.g., FNOs, DeepONets) and neural ODEs on the lower-dimensional latent space given by the autoencoders. As such, the implications of this study are significant and of interest to the multiphase flow community and beyond.