LGOct 6, 2022
Probabilistic partition of unity networks for high-dimensional regression problemsTiffany Fan, Nathaniel Trask, Marta D'Elia et al. · stanford
We explore the probabilistic partition of unity network (PPOU-Net) model in the context of high-dimensional regression problems and propose a general framework focusing on adaptive dimensionality reduction. With the proposed framework, the target function is approximated by a mixture of experts model on a low-dimensional manifold, where each cluster is associated with a local fixed-degree polynomial. We present a training strategy that leverages the expectation maximization (EM) algorithm. During the training, we alternate between (i) applying gradient descent to update the DNN coefficients; and (ii) using closed-form formulae derived from the EM algorithm to update the mixture of experts model parameters. Under the probabilistic formulation, step (ii) admits the form of embarrassingly parallelizable weighted least-squares solves. The PPOU-Nets consistently outperform the baseline fully-connected neural networks of comparable sizes in numerical experiments of various data dimensions. We also explore the proposed model in applications of quantum computing, where the PPOU-Nets act as surrogate models for cost landscapes associated with variational quantum circuits.
CEJan 31, 2025
Physically Interpretable Representation and Controlled Generation for Turbulence DataTiffany Fan, Murray Cutforth, Marta D'Elia et al.
Computational Fluid Dynamics (CFD) plays a pivotal role in fluid mechanics, enabling precise simulations of fluid behavior through partial differential equations (PDEs). However, traditional CFD methods are resource-intensive, particularly for high-fidelity simulations of complex flows, which are further complicated by high dimensionality, inherent stochasticity, and limited data availability. This paper addresses these challenges by proposing a data-driven approach that leverages a Gaussian Mixture Variational Autoencoder (GMVAE) to encode high-dimensional scientific data into low-dimensional, physically meaningful representations. The GMVAE learns a structured latent space where data can be categorized based on physical properties such as the Reynolds number while maintaining global physical consistency. To assess the interpretability of the learned representations, we introduce a novel metric based on graph spectral theory, quantifying the smoothness of physical quantities along the latent manifold. We validate our approach using 2D Navier-Stokes simulations of flow past a cylinder over a range of Reynolds numbers. Our results demonstrate that the GMVAE provides improved clustering, meaningful latent structure, and robust generative capabilities compared to baseline dimensionality reduction methods. This framework offers a promising direction for data-driven turbulence modeling and broader applications in computational fluid dynamics and engineering systems.
LGNov 26, 2025
Physically Interpretable Representation Learning with Gaussian Mixture Variational AutoEncoder (GM-VAE)Tiffany Fan, Murray Cutforth, Marta D'Elia et al.
Extracting compact, physically interpretable representations from high-dimensional scientific data is a persistent challenge due to the complex, nonlinear structures inherent in physical systems. We propose a Gaussian Mixture Variational Autoencoder (GM-VAE) framework designed to address this by integrating an Expectation-Maximization (EM)-inspired training scheme with a novel spectral interpretability metric. Unlike conventional VAEs that jointly optimize reconstruction and clustering (often leading to training instability), our method utilizes a block-coordinate descent strategy, alternating between expectation and maximization steps. This approach stabilizes training and naturally aligns latent clusters with distinct physical regimes. To objectively evaluate the learned representations, we introduce a quantitative metric based on graph-Laplacian smoothness, which measures the coherence of physical quantities across the latent manifold. We demonstrate the efficacy of this framework on datasets of increasing complexity: surface reaction ODEs, Navier-Stokes wake flows, and experimental laser-induced combustion Schlieren images. The results show that our GM-VAE yields smooth, physically consistent manifolds and accurate regime clustering, offering a robust data-driven tool for interpreting turbulent and reactive flow systems.
LGOct 9, 2025
Multi-fidelity Batch Active Learning for Gaussian Process ClassifiersMurray Cutforth, Yiming Yang, Tiffany Fan et al.
Many science and engineering problems rely on expensive computational simulations, where a multi-fidelity approach can accelerate the exploration of a parameter space. We study efficient allocation of a simulation budget using a Gaussian Process (GP) model in the binary simulation output case. This paper introduces Bernoulli Parameter Mutual Information (BPMI), a batch active learning algorithm for multi-fidelity GP classifiers. BPMI circumvents the intractability of calculating mutual information in the probability space by employing a first-order Taylor expansion of the link function. We evaluate BPMI against several baselines on two synthetic test cases and a complex, real-world application involving the simulation of a laser-ignited rocket combustor. In all experiments, BPMI demonstrates superior performance, achieving higher predictive accuracy for a fixed computational budget.