Raphaël Huser

ML
h-index18
5papers
104citations
Novelty42%
AI Score33

5 Papers

MLJul 16, 2023
Flexible and efficient emulation of spatial extremes processes via variational autoencoders

Likun Zhang, Xiaoyu Ma, Christopher K. Wikle et al.

Many real-world processes have complex tail dependence structures that cannot be characterized using classical Gaussian processes. More flexible spatial extremes models exhibit appealing extremal dependence properties but are often exceedingly prohibitive to fit and simulate from in high dimensions. In this paper, we aim to push the boundaries on computation and modeling of high-dimensional spatial extremes via integrating a new spatial extremes model that has flexible and non-stationary dependence properties in the encoding-decoding structure of a variational autoencoder called the XVAE. The XVAE can emulate spatial observations and produce outputs that have the same statistical properties as the inputs, especially in the tail. Our approach also provides a novel way of making fast inference with complex extreme-value processes. Through extensive simulation studies, we show that our XVAE is substantially more time-efficient than traditional Bayesian inference while outperforming many spatial extremes models with a stationary dependence structure. Lastly, we analyze a high-resolution satellite-derived dataset of sea surface temperature in the Red Sea, which includes 30 years of daily measurements at 16703 grid cells. We demonstrate how to use XVAE to identify regions susceptible to marine heatwaves under climate change and examine the spatial and temporal variability of the extremal dependence structure.

MLAug 16, 2022
Regression modelling of spatiotemporal extreme U.S. wildfires via partially-interpretable neural networks

Jordan Richards, Raphaël Huser

Risk management in many environmental settings requires an understanding of the mechanisms that drive extreme events. Useful metrics for quantifying such risk are extreme quantiles of response variables conditioned on predictor variables that describe, e.g., climate, biosphere and environmental states. Typically these quantiles lie outside the range of observable data and so, for estimation, require specification of parametric extreme value models within a regression framework. Classical approaches in this context utilise linear or additive relationships between predictor and response variables and suffer in either their predictive capabilities or computational efficiency; moreover, their simplicity is unlikely to capture the truly complex structures that lead to the creation of extreme wildfires. In this paper, we propose a new methodological framework for performing extreme quantile regression using artificial neutral networks, which are able to capture complex non-linear relationships and scale well to high-dimensional data. The "black box" nature of neural networks means that they lack the desirable trait of interpretability often favoured by practitioners; thus, we unify linear, and additive, regression methodology with deep learning to create partially-interpretable neural networks that can be used for statistical inference but retain high prediction accuracy. To complement this methodology, we further propose a novel point process model for extreme values which overcomes the finite lower-endpoint problem associated with the generalised extreme value class of distributions. Efficacy of our unified framework is illustrated on U.S. wildfire data with a high-dimensional predictor set and we illustrate vast improvements in predictive performance over linear and spline-based regression techniques.

MLApr 18, 2024
Neural Methods for Amortized Inference

Andrew Zammit-Mangion, Matthew Sainsbury-Dale, Raphaël Huser

Simulation-based methods for statistical inference have evolved dramatically over the past 50 years, keeping pace with technological advancements. The field is undergoing a new revolution as it embraces the representational capacity of neural networks, optimization libraries and graphics processing units for learning complex mappings between data and inferential targets. The resulting tools are amortized, in the sense that, after an initial setup cost, they allow rapid inference through fast feed-forward operations. In this article we review recent progress in the context of point estimation, approximate Bayesian inference, summary-statistic construction, and likelihood approximation. We also cover software, and include a simple illustration to showcase the wide array of tools available for amortized inference and the benefits they offer over Markov chain Monte Carlo methods. The article concludes with an overview of relevant topics and an outlook on future research directions.

MLOct 5, 2025
Simulation-based inference via telescoping ratio estimation for trawl processes

Dan Leonte, Raphaël Huser, Almut E. D. Veraart

The growing availability of large and complex datasets has increased interest in temporal stochastic processes that can capture stylized facts such as marginal skewness, non-Gaussian tails, long memory, and even non-Markovian dynamics. While such models are often easy to simulate from, parameter estimation remains challenging. Simulation-based inference (SBI) offers a promising way forward, but existing methods typically require large training datasets or complex architectures and frequently yield confidence (credible) regions that fail to attain their nominal values, raising doubts on the reliability of estimates for the very features that motivate the use of these models. To address these challenges, we propose a fast and accurate, sample-efficient SBI framework for amortized posterior inference applicable to intractable stochastic processes. The proposed approach relies on two main steps: first, we learn the posterior density by decomposing it sequentially across parameter dimensions. Then, we use Chebyshev polynomial approximations to efficiently generate independent posterior samples, enabling accurate inference even when Markov chain Monte Carlo methods mix poorly. We further develop novel diagnostic tools for SBI in this context, as well as post-hoc calibration techniques; the latter not only lead to performance improvements of the learned inferential tool, but also to the ability to reuse it directly with new time series of varying lengths, thus amortizing the training cost. We demonstrate the method's effectiveness on trawl processes, a class of flexible infinitely divisible models that generalize univariate Gaussian processes, applied to energy demand data.

LGJan 25, 2024
At the junction between deep learning and statistics of extremes: formalizing the landslide hazard definition

Ashok Dahal, Raphaël Huser, Luigi Lombardo

The most adopted definition of landslide hazard combines spatial information about landslide location (susceptibility), threat (intensity), and frequency (return period). Only the first two elements are usually considered and estimated when working over vast areas. Even then, separate models constitute the standard, with frequency being rarely investigated. Frequency and intensity are intertwined and depend on each other because larger events occur less frequently and vice versa. However, due to the lack of multi-temporal inventories and joint statistical models, modelling such properties via a unified hazard model has always been challenging and has yet to be attempted. Here, we develop a unified model to estimate landslide hazard at the slope unit level to address such gaps. We employed deep learning, combined with a model motivated by extreme-value theory to analyse an inventory of 30 years of observed rainfall-triggered landslides in Nepal and assess landslide hazard for multiple return periods. We also use our model to further explore landslide hazard for the same return periods under different climate change scenarios up to the end of the century. Our results show that the proposed model performs excellently and can be used to model landslide hazard in a unified manner. Geomorphologically, we find that under both climate change scenarios (SSP245 and SSP885), landslide hazard is likely to increase up to two times on average in the lower Himalayan regions while remaining the same in the middle Himalayan region whilst decreasing slightly in the upper Himalayan region areas.