Andrew Zammit-Mangion

h-index: 22

6 papers

230 citations

Novelty: 31%

AI Score: 28

Ranked #151,718 of 194,257 authors (top 78%)

#2,493 in ML (top 74%)

6 Papers

15.6 · ML · Jun 5, 2022

Statistical Deep Learning for Spatial and Spatio-Temporal Data

Christopher K. Wikle, Andrew Zammit-Mangion

Deep neural network models have become ubiquitous in recent years, and have been applied to nearly all areas of science, engineering, and industry. These models are particularly useful for data that have strong dependencies in space (e.g., images) and time (e.g., sequences). Indeed, deep models have also been extensively used by the statistical community to model spatial and spatio-temporal data through, for example, the use of multi-level Bayesian hierarchical models and deep Gaussian processes. In this review, we first present an overview of traditional statistical and machine learning perspectives for modeling spatial and spatio-temporal data, and then focus on a variety of hybrid models that have recently been developed for latent process, data, and parameter specifications. These hybrid models integrate statistical modeling ideas with deep neural network models in order to take advantage of the strengths of each modeling paradigm. We conclude by giving an overview of computational technologies that have proven useful for these hybrid models, and with a brief discussion on future research directions.

4.3 · ML · Nov 16, 2023 · Code

Spatial Bayesian Neural Networks

Andrew Zammit-Mangion, Michael D. Kaminski, Ba-Hien Tran et al.

interpretable, and well understood models that are routinely employed even though, as is revealed through prior and posterior predictive checks, these can poorly characterise the spatial heterogeneity in the underlying process of interest. Here, we propose a new, flexible class of spatial-process models, which we refer to as spatial Bayesian neural networks (SBNNs). An SBNN leverages the representational capacity of a Bayesian neural network; it is tailored to a spatial setting by incorporating a spatial "embedding layer" into the network and, possibly, spatially-varying network parameters. An SBNN is calibrated by matching its finite-dimensional distribution at locations on a fine gridding of space to that of a target process of interest. That process could be easy to simulate from or we may have many realisations from it. We propose several variants of SBNNs, most of which are able to match the finite-dimensional distribution of the target process at the selected grid better than conventional BNNs of similar complexity. We also show that an SBNN can be used to represent a variety of spatial processes often used in practice, such as Gaussian processes, lognormal processes, and max-stable processes. We briefly discuss the tools that could be used to make inference with SBNNs, and we conclude with a discussion of their advantages and limitations.

24.9 · ML · Apr 18, 2024

Neural Methods for Amortized Inference

Andrew Zammit-Mangion, Matthew Sainsbury-Dale, Raphaël Huser

Simulation-based methods for statistical inference have evolved dramatically over the past 50 years, keeping pace with technological advancements. The field is undergoing a new revolution as it embraces the representational capacity of neural networks, optimization libraries and graphics processing units for learning complex mappings between data and inferential targets. The resulting tools are amortized, in the sense that, after an initial setup cost, they allow rapid inference through fast feed-forward operations. In this article we review recent progress in the context of point estimation, approximate Bayesian inference, summary-statistic construction, and likelihood approximation. We also cover software, and include a simple illustration to showcase the wide array of tools available for amortized inference and the benefits they offer over Markov chain Monte Carlo methods. The article concludes with an overview of relevant topics and an outlook on future research directions.

2.3 · ME · Jan 24, 2022 · Code

Spherical Poisson Point Process Intensity Function Modeling and Estimation with Measure Transport

Tin Lok James Ng, Andrew Zammit-Mangion

Recent years have seen an increased interest in the application of methods and techniques commonly associated with machine learning and artificial intelligence to spatial statistics. Here, in a celebration of the ten-year anniversary of the journal Spatial Statistics, we bring together normalizing flows, commonly used for density function estimation in machine learning, and spherical point processes, a topic of particular interest to the journal's readership, to present a new approach for modeling non-homogeneous Poisson process intensity functions on the sphere. The central idea of this framework is to build, and estimate, a flexible bijective map that transforms the underlying intensity function of interest on the sphere into a simpler, reference, intensity function, also on the sphere. Map estimation can be done efficiently using automatic differentiation and stochastic gradient descent, and uncertainty quantification can be done straightforwardly via nonparametric bootstrap. We investigate the viability of the proposed method in a simulation study, and illustrate its use in a proof-of-concept study where we model the intensity of cyclone events in the North Pacific Ocean. Our experiments reveal that normalizing flows present a flexible and straightforward way to model intensity functions on spheres, but that their potential to yield a good fit depends on the architecture of the bijective map, which can be difficult to establish in practice.

3.1 · LG · Dec 22, 2021 · Code

Emulation of greenhouse-gas sensitivities using variational autoencoders

Laura Cartwright, Andrew Zammit-Mangion, Nicholas M. Deutscher

Flux inversion is the process by which sources and sinks of a gas are identified from observations of gas mole fraction. The inversion often involves running a Lagrangian particle dispersion model (LPDM) to generate sensitivities between observations and fluxes over a spatial domain of interest. The LPDM must be run backward in time for every gas measurement, and this can be computationally prohibitive. To address this problem, here we develop a novel spatio-temporal emulator for LPDM sensitivities that is built using a convolutional variational autoencoder (CVAE). With the encoder segment of the CVAE, we obtain approximate (variational) posterior distributions over latent variables in a low-dimensional space. We then use a spatio-temporal Gaussian process emulator on the low-dimensional space to emulate new variables at prediction locations and time points. Emulated variables are then passed through the decoder segment of the CVAE to yield emulated sensitivities. We show that our CVAE-based emulator outperforms the more traditional emulator built using empirical orthogonal functions and that it can be used with different LPDMs. We conclude that our emulation-based approach can be used to reliably reduce the computing time needed to generate LPDM outputs for use in high-resolution flux inversions.

12.2 · ME · Jun 6, 2019

Deep Compositional Spatial Models

Andrew Zammit-Mangion, Tin Lok James Ng, Quan Vu et al.

Spatial processes with nonstationary and anisotropic covariance structure are often used when modelling, analysing and predicting complex environmental phenomena. Such processes may often be expressed as ones that have stationary and isotropic covariance structure on a warped spatial domain. However, the warping function is generally difficult to fit and not constrained to be injective, often resulting in 'space-folding.' Here, we propose modelling an injective warping function through a composition of multiple elemental injective functions in a deep-learning framework. We consider two cases; first, when these functions are known up to some weights that need to be estimated, and, second, when the weights in each layer are random. Inspired by recent methodological and technological advances in deep learning and deep Gaussian processes, we employ approximate Bayesian methods to make inference with these models using graphics processing units. Through simulation studies in one and two dimensions we show that the deep compositional spatial models are quick to fit, and are able to provide better predictions and uncertainty quantification than other deep stochastic models of similar complexity. We also show their remarkable capacity to model nonstationary, anisotropic spatial data using radiances from the MODIS instrument aboard the Aqua satellite.