LGMay 27, 2022
Targeted Adaptive DesignCarlo Graziani, Marieme Ngom
Modern advanced manufacturing and advanced materials design often require searches of relatively high-dimensional process control parameter spaces for settings that result in optimal structure, property, and performance parameters. The mapping from the former to the latter must be determined from noisy experiments or from expensive simulations. We abstract this problem to a mathematical framework in which an unknown function from a control space to a design space must be ascertained by means of expensive noisy measurements, which locate optimal control settings generating desired design features within specified tolerances, with quantified uncertainty. We describe targeted adaptive design (TAD), a new algorithm that performs this sampling task efficiently. TAD creates a Gaussian process surrogate model of the unknown mapping at each iterative stage, proposing a new batch of control settings to sample experimentally and optimizing the updated log-predictive likelihood of the target design. TAD either stops upon locating a solution with uncertainties that fit inside the tolerance box or uses a measure of expected future information to determine that the search space has been exhausted with no solution. TAD thus embodies the exploration-exploitation tension in a manner that recalls, but is essentially different from, Bayesian optimization and optimal experimental design.
MLMar 23, 2022
A Deep Learning Approach to Probabilistic Forecasting of WeatherNick Rittler, Carlo Graziani, Jiali Wang et al.
We discuss an approach to probabilistic forecasting based on two chained machine-learning steps: a dimensional reduction step that learns a reduction map of predictor information to a low-dimensional space in a manner designed to preserve information about forecast quantities; and a density estimation step that uses the probabilistic machine learning technique of normalizing flows to compute the joint probability density of reduced predictors and forecast quantities. This joint density is then renormalized to produce the conditional forecast distribution. In this method, probabilistic calibration testing plays the role of a regularization procedure, preventing overfitting in the second step, while effective dimensional reduction from the first step is the source of forecast sharpness. We verify the method using a 22-year 1-hour cadence time series of Weather Research and Forecasting (WRF) simulation data of surface wind on a grid.
23.9CLMay 20
Probabilistic Attribution For Large Language ModelsShilpika Shilpika, Carlo Graziani, Bethany Lusch et al.
The generative nature of Large Language Models (LLMs) is reflected in the conditional probabilities they compute to sample each response token given the previous tokens. These probabilities encode the distributional structure that the model learns in training and exploits in inference. In this work, we use these probabilities to situate LLMs within the mathematical theory of stochastic processes. We use this framework to design a model-agnostic probabilistic token attribution measure, using Bayes rule to invert the next-token log-probabilities so as to capture the models internal representation of the distribution over token sequences. The representation is independent of the models computational structure. This representation yields the conditional probability of the response given the prompt, and of the response given the prompt with a token marginalized away. Our attribution score is the log of the ratio of these probabilities. We further compute the entropies of a single prompts token distributions, conditioned on the remaining context. The interplay between entropy and attribution score sheds light on LLM behavior. We evaluate 8 models across 7 prompts and investigate anomalies, token sensitivity, response stability, model stability, and training convergence, thereby improving interpretability and guiding users to focus on uncertain or unstable parts of the generation.
DSSep 26, 2023
OS-net: Orbitally Stable Neural NetworksMarieme Ngom, Carlo Graziani
We introduce OS-net (Orbitally Stable neural NETworks), a new family of neural network architectures specifically designed for periodic dynamical data. OS-net is a special case of Neural Ordinary Differential Equations (NODEs) and takes full advantage of the adjoint method based backpropagation method. Utilizing ODE theory, we derive conditions on the network weights to ensure stability of the resulting dynamics. We demonstrate the efficacy of our approach by applying OS-net to discover the dynamics underlying the Rössler and Sprott's systems, two dynamical systems known for their period doubling attractors and chaotic behavior.
MESep 17, 2025
Kernel Model Validation: How To Do It, And Why You Should CareCarlo Graziani, Marieme Ngom
Gaussian Process (GP) models are popular tools in uncertainty quantification (UQ) because they purport to furnish functional uncertainty estimates that can be used to represent model uncertainty. It is often difficult to state with precision what probabilistic interpretation attaches to such an uncertainty, and in what way is it calibrated. Without such a calibration statement, the value of such uncertainty estimates is quite limited and qualitative. We motivate the importance of proper probabilistic calibration of GP predictions by describing how GP predictive calibration failures can cause degraded convergence properties in a target optimization algorithm called Targeted Adaptive Design (TAD). We discuss the interpretation of GP-generated uncertainty intervals in UQ, and how one may learn to trust them, through a formal procedure for covariance kernel validation that exploits the multivariate normal nature of GP predictions. We give simple examples of GP regression misspecified 1-dimensional models, and discuss the situation with respect to higher-dimensional models.
APMar 29, 2021
Simultaneous Reconstruction and Uncertainty Quantification for TomographyAgnimitra Dasgupta, Carlo Graziani, Zichao Wendy Di
Tomographic reconstruction, despite its revolutionary impact on a wide range of applications, suffers from its ill-posed nature in that there is no unique solution because of limited and noisy measurements. Therefore, in the absence of ground truth, quantifying the solution quality is highly desirable but under-explored. In this work, we address this challenge through Gaussian process modeling to flexibly and explicitly incorporate prior knowledge of sample features and experimental noises through the choices of the kernels and noise models. Our proposed method yields not only comparable reconstruction to existing practical reconstruction methods (e.g., regularized iterative solver for inverse problem) but also an efficient way of quantifying solution uncertainties. We demonstrate the capabilities of the proposed approach on various images and show its unique capability of uncertainty quantification in the presence of various noises.