AO-PH | May 19
Improving Ensemble CAPE Forecasts with a Diffusion Model Incorporating Aerosol Information
Zachary James, Joseph Guinness, Arthur DeGaetano
Convective available potential energy (CAPE) is an important variable for forecasting severe weather and understanding deep convection and precipitation. The latest versions of the Global Forecast System (GFS) and related Global Ensemble Forecast System (GEFS) have exhibited a bias towards underestimating CAPE values during the summertime. We train an artificial intelligence (AI) diffusion model to improve the skill and uncertainty quantification of afternoon 6-hour lead time ensemble forecasts over the United States. Our model takes a GFS CAPE forecast as input and outputs an ensemble that significantly outperforms both GFS and GEFS 6-hour forecasts on root mean square error, continuous ranked probability score, and Brier score. We propose a two-stage training pipeline to leverage both a larger historical GFS forecast dataset and a smaller historical GEFS dataset, despite the two using initialization and parameterization schemes that vary over time. We also show that classifier-free guidance can be used to control the skill and spread of the forecasts. We then demonstrate the versatility of our framework by adding aerosol optical depths (AODs) of black carbon, organic carbon, dust, sea salt, and sulfates as additional input features. Aerosols can invigorate or suppress convection depending on atmospheric conditions. Our AI models effectively incorporate aerosols to produce improved CAPE forecasts. We interpret the model components by using permutation feature importance to rank the influence of the different AODs and find that black carbon, organic carbon, and sulfate aerosols have a greater impact on the model's CAPE predictions than sea salt and dust aerosols.
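The continuous ranked probability score mentioned above can be estimated directly from ensemble members. The following is a minimal sketch of the standard empirical estimator at a single grid point, not the authors' evaluation code; the function name `ensemble_crps` and the sample values are illustrative assumptions.

```python
import numpy as np

def ensemble_crps(members, obs):
    """Empirical CRPS of an ensemble against one scalar observation.

    Uses the standard estimator E|X - y| - 0.5 * E|X - X'|, with both
    expectations taken over the ensemble members. Lower is better.
    """
    members = np.asarray(members, dtype=float)
    # Average absolute error of the members against the observation.
    accuracy = np.mean(np.abs(members - obs))
    # Average absolute difference between all pairs of members (spread).
    spread = np.mean(np.abs(members[:, None] - members[None, :]))
    return accuracy - 0.5 * spread

# Hypothetical CAPE values (J/kg) at one location, for illustration only.
score = ensemble_crps([900.0, 1100.0, 1250.0, 1400.0], obs=1200.0)
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often read as a probabilistic generalization of mean absolute error.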

CO | Jul 3, 2024
Implementation and Analysis of GPU Algorithms for Vecchia Approximation
Zachary James, Joseph Guinness
Gaussian processes have become an indispensable part of the spatial statistician's toolbox but are unsuitable for analyzing large datasets because of the significant time and memory needed to fit the associated model exactly. Vecchia Approximation is widely used to reduce the computational complexity and can be calculated with embarrassingly parallel algorithms. While multi-core software has been developed for Vecchia Approximation, such as the GpGp R package, software designed to run on graphics processing units (GPUs) is lacking, despite the tremendous success GPUs have had in statistics and machine learning. We compare three different ways to implement Vecchia Approximation on a GPU, two of which are similar to methods used for other Gaussian process approximations and one of which is new. The impact of memory type on performance is investigated and the final method is optimized accordingly. We show that our new method outperforms the other two and then present it in the GpGpU R package. We compare GpGpU to existing multi-core and GPU-accelerated software by fitting Gaussian process models on various datasets, including a large spatial-temporal dataset of $n>10^6$ points collected from an earth-observing satellite. Our results show that GpGpU achieves faster runtimes and better predictive accuracy.
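To make the "embarrassingly parallel" structure concrete, here is a minimal CPU sketch of a Vecchia log-likelihood in NumPy. It is not the GpGp or GpGpU implementation: the exponential covariance, the fixed parameters, the use of the given point order, and the names `exp_cov` and `vecchia_loglik` are all assumptions made for illustration.

```python
import numpy as np

def exp_cov(a, b, variance=1.0, range_=0.3):
    """Exponential covariance between two sets of locations (assumed model)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return variance * np.exp(-d / range_)

def vecchia_loglik(locs, y, m, nugget=1e-6):
    """Sum of log conditional densities of y[i] given up to m nearest
    previously ordered observations. Each term is independent of the
    others, so the loop could be distributed across threads."""
    total = 0.0
    for i in range(len(y)):
        # Conditioning set: nearest neighbors among points 0..i-1.
        d = np.linalg.norm(locs[:i] - locs[i], axis=1)
        nbrs = np.argsort(d)[:m]
        idx = np.append(nbrs, i)  # target point goes last
        K = exp_cov(locs[idx], locs[idx]) + nugget * np.eye(len(idx))
        L = np.linalg.cholesky(K)
        z = np.linalg.solve(L, y[idx])
        # With the target last, L[-1, -1] is the conditional standard
        # deviation and z[-1] is the standardized conditional residual.
        total += -0.5 * np.log(2 * np.pi) - np.log(L[-1, -1]) - 0.5 * z[-1] ** 2
    return total

rng = np.random.default_rng(0)
locs = rng.uniform(size=(8, 2))
y = rng.normal(size=8)
approx = vecchia_loglik(locs, y, m=3)
```

When `m` is at least `n - 1`, every point conditions on all earlier points and the sum equals the exact Gaussian log-likelihood, which gives a convenient correctness check for any implementation.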

LG | Jul 16, 2025
A Framework for Nonstationary Gaussian Processes with Neural Network Parameters
Zachary James, Joseph Guinness
Gaussian processes have become a popular tool for nonparametric regression because of their flexibility and uncertainty quantification. However, they often use stationary kernels, which limit the expressiveness of the model and may be unsuitable for many datasets. We propose a framework that uses nonstationary kernels whose parameters vary across the feature space, modeling these parameters as the output of a neural network that takes the features as input. The neural network and Gaussian process are trained jointly using the chain rule to calculate derivatives. Our method clearly describes the behavior of the nonstationary parameters and is compatible with approximation methods for scaling to large datasets. It is flexible and easily adapts to different nonstationary kernels without needing to redesign the optimization procedure. Our methods are implemented with the GPyTorch library and can be readily modified. We test a nonstationary variance and noise variant of our method on several machine learning datasets and find that it achieves better accuracy and log-score than both a stationary model and a hierarchical model approximated with variational inference. Similar results are observed for a model with only nonstationary variance. We also demonstrate our approach's ability to recover the nonstationary parameters of a spatial dataset.
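As an illustration of a kernel whose variance varies across the feature space, the sketch below scales a stationary squared-exponential kernel by a standard deviation produced by a small neural network. This is one common construction and only an assumption about the general idea; it is written in plain NumPy rather than GPyTorch, the weights are random rather than trained, and the names `mlp_log_sd` and `nonstationary_kernel` are hypothetical.

```python
import numpy as np

def mlp_log_sd(x, W1, b1, w2, b2):
    """One-hidden-layer network mapping features to a log standard deviation."""
    h = np.tanh(x @ W1 + b1)
    return h @ w2 + b2

def nonstationary_kernel(xa, xb, params, lengthscale=0.5):
    """k(x, x') = sd(x) * k0(x, x') * sd(x'), with k0 squared-exponential.

    Multiplying a valid kernel by sd(x) * sd(x') keeps it positive
    semidefinite, so sd(.) can be any positive function of the features.
    """
    sd_a = np.exp(mlp_log_sd(xa, *params))
    sd_b = np.exp(mlp_log_sd(xb, *params))
    sq = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    base = np.exp(-0.5 * sq / lengthscale**2)
    return sd_a[:, None] * base * sd_b[None, :]

rng = np.random.default_rng(1)
d, hidden = 2, 4
# Untrained, randomly initialized weights (illustration only).
params = (rng.normal(size=(d, hidden)), rng.normal(size=hidden),
          rng.normal(size=hidden), 0.0)
X = rng.uniform(size=(6, d))
K = nonstationary_kernel(X, X, params)
```

In a joint training setup, the network weights would be optimized together with the remaining kernel parameters by differentiating the Gaussian process objective through `K`, which automatic differentiation libraries handle via the chain rule.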

ME | Feb 25, 2022
Scalable Gaussian-process regression and variable selection using Vecchia approximations
Jian Cao, Joseph Guinness, Marc G. Genton et al.
Gaussian process (GP) regression is a flexible, nonparametric approach to regression that naturally quantifies uncertainty. In many applications, the numbers of responses and covariates are both large, and a goal is to select covariates that are related to the response. For this setting, we propose a novel, scalable algorithm, coined VGPR, which optimizes a penalized GP log-likelihood based on the Vecchia GP approximation, an ordered conditional approximation from spatial statistics that implies a sparse Cholesky factor of the precision matrix. We traverse the regularization path from strong to weak penalization, sequentially adding candidate covariates based on the gradient of the log-likelihood and deselecting irrelevant covariates via a new quadratic constrained coordinate descent algorithm. We propose Vecchia-based mini-batch subsampling, which provides unbiased gradient estimators. The resulting procedure is scalable to millions of responses and thousands of covariates. Theoretical analysis and numerical studies demonstrate the improved scalability and accuracy relative to existing methods.

ME | May 1, 2020
Scaled Vecchia approximation for fast computer-model emulation
Matthias Katzfuss, Joseph Guinness, Earl Lawrence
Many scientific phenomena are studied using computer experiments consisting of multiple runs of a computer model while varying the input settings. Gaussian processes (GPs) are a popular tool for the analysis of computer experiments, enabling interpolation between input settings, but direct GP inference is computationally infeasible for large datasets. We adapt and extend a powerful class of GP methods from spatial statistics to enable the scalable analysis and emulation of large computer experiments. Specifically, we apply Vecchia's ordered conditional approximation in a transformed input space, with each input scaled according to how strongly it relates to the computer-model response. The scaling is learned from the data, by estimating parameters in the GP covariance function using Fisher scoring. Our methods are highly scalable, enabling estimation, joint prediction and simulation in near-linear time in the number of model runs. In several numerical examples, our approach substantially outperformed existing methods.
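The effect of choosing conditioning sets in a transformed input space can be shown with a small sketch. It assumes the scaling is a per-input multiplier (for example an inverse range parameter) and uses the given row order instead of a specialized ordering; the function name `scaled_neighbors` and the toy inputs are illustrative, not taken from the paper.

```python
import numpy as np

def scaled_neighbors(X, inv_ranges, m):
    """For each run i, return indices of up to m nearest earlier runs,
    with distances measured after scaling each input dimension."""
    Xs = np.asarray(X, dtype=float) * np.asarray(inv_ranges, dtype=float)
    sets = []
    for i in range(len(Xs)):
        d = np.linalg.norm(Xs[:i] - Xs[i], axis=1)
        sets.append(np.argsort(d)[:m])
    return sets

# Toy design: the second input is assumed to be unrelated to the response.
X = np.array([[0.00, 0.0],
              [0.10, 5.0],
              [1.00, 0.1],
              [0.04, 9.0]])

unscaled = scaled_neighbors(X, inv_ranges=[1.0, 1.0], m=2)
scaled = scaled_neighbors(X, inv_ranges=[1.0, 0.0], m=2)
# Last run: unscaled neighbors are runs 1 and 2, scaled neighbors are runs 0 and 1.
```

With the irrelevant input scaled to zero, the last run conditions on the two runs closest in the relevant input, which is the behavior the scaling is meant to produce. In practice the multipliers would be estimated from the data rather than fixed by hand.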

CO | Dec 26, 2019
Inverses of Matern Covariances on Grids
Joseph Guinness
We conduct a study of the aliased spectral densities of Matérn covariance functions on a regular grid of points, providing clarity on the properties of a popular approximation based on stochastic partial differential equations (SPDEs). While others have shown that it can approximate the covariance function well, we find that it assigns too much power at high frequencies and does not provide increasingly accurate approximations to the inverse as the grid spacing goes to zero, except in the one-dimensional exponential covariance case. We provide numerical results to support our theory, and in a simulation study, we investigate the implications for parameter estimation, finding that the SPDE approximation tends to overestimate spatial range parameters.

CO | May 20, 2019
Gaussian Process Learning via Fisher Scoring of Vecchia's Approximation
Joseph Guinness
We derive a single-pass algorithm for computing the gradient and Fisher information of Vecchia's Gaussian process loglikelihood approximation, which provides a computationally efficient means for applying the Fisher scoring algorithm for maximizing the loglikelihood. The advantages of the optimization techniques are demonstrated in numerical examples and in an application to Argo ocean temperature data. The new methods are more accurate and much faster than an optimization method that uses only function evaluations, especially when the covariance function has many parameters. This allows practitioners to fit nonstationary models to large spatial and spatial-temporal datasets.
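For reference, a single Fisher scoring update for covariance parameters can be sketched as follows. This version uses the exact zero-mean Gaussian loglikelihood with dense matrices, not Vecchia's approximation or the single-pass algorithm described above; the function name `fisher_scoring_step` and the callbacks `cov_fn` and `dcov_fn` are assumptions for illustration.

```python
import numpy as np

def fisher_scoring_step(theta, y, cov_fn, dcov_fn):
    """One update theta + I(theta)^{-1} * gradient for a zero-mean GP.

    cov_fn(theta) returns the covariance matrix K, and dcov_fn(theta)
    returns a list of its partial derivatives, one per parameter.
    """
    K = cov_fn(theta)
    dKs = dcov_fn(theta)
    Kinv_y = np.linalg.solve(K, y)
    Kinv_dK = [np.linalg.solve(K, dK) for dK in dKs]
    # Gradient: 0.5 * y' K^-1 dK_j K^-1 y - 0.5 * tr(K^-1 dK_j)
    grad = np.array([0.5 * Kinv_y @ dK @ Kinv_y - 0.5 * np.trace(A)
                     for dK, A in zip(dKs, Kinv_dK)])
    # Fisher information: 0.5 * tr(K^-1 dK_j K^-1 dK_k)
    info = np.array([[0.5 * np.trace(A @ B) for B in Kinv_dK]
                     for A in Kinv_dK])
    return theta + np.linalg.solve(info, grad)

# Example: estimate a variance multiplier s in K = s * R with R known.
rng = np.random.default_rng(2)
pts = rng.uniform(size=(10, 1))
R = np.exp(-np.abs(pts - pts.T) / 0.3)
y = rng.normal(size=10)
theta_new = fisher_scoring_step(np.array([5.0]), y,
                                cov_fn=lambda th: th[0] * R,
                                dcov_fn=lambda th: [R])
```

In this one-parameter case a single step lands on the maximum likelihood estimate `y' R^{-1} y / n` from any positive starting value. With several parameters the update is iterated, and practical implementations typically add step-size safeguards.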