Kate Duffy

LG
4 papers
77 citations
Novelty 53%
AI Score 28

4 Papers

LG · Jul 16, 2024
Global atmospheric data assimilation with multi-modal masked autoencoders

Thomas J. Vandal, Kate Duffy, Daniel McDuff et al.

Global data assimilation enables weather forecasting at all scales and provides valuable data for studying the Earth system. However, the computational demands of physics-based algorithms used in operational systems limit the volume and diversity of observations that are assimilated. Here, we present "EarthNet", a multi-modal foundation model for data assimilation that learns to predict a global gap-filled atmospheric state solely from satellite observations. EarthNet is trained as a masked autoencoder that ingests a 12-hour sequence of observations and learns to fill missing data from other sensors. We show that EarthNet performs a form of data assimilation, producing a global 0.16-degree reanalysis dataset of 3D atmospheric temperature and humidity in a fraction of the time required by operational systems. We show that the resulting reanalysis dataset reproduces climatology by evaluating a 1-hour forecast background state against observations. We also show that our 3D humidity predictions outperform MERRA-2 and ERA5 reanalyses by 10% to 60% between the middle troposphere and lower stratosphere (5 to 20 km altitude) and that our 3D temperature and humidity are statistically equivalent to the Microwave integrated Retrieval System (MiRS) observations at nearly every level of the atmosphere. Our results indicate significant promise in using EarthNet for high-frequency data assimilation and global weather forecasting.
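The masked-autoencoder training idea mentioned in this abstract can be illustrated with a minimal sketch. This is not EarthNet's implementation: the function names `apply_mask` and `masked_mse`, the zero fill value, and the choice of mean squared error are all assumptions made for illustration. The sketch only shows the general pattern of hiding part of the input and scoring the reconstruction on the hidden entries.

```python
import numpy as np

def apply_mask(x, mask, fill=0.0):
    """Hide entries where mask is True by replacing them with `fill`.

    Hypothetical helper: a real model would typically use learned mask
    tokens rather than a constant fill value.
    """
    return np.where(np.asarray(mask, dtype=bool), fill, np.asarray(x, dtype=float))

def masked_mse(pred, target, mask):
    """Mean squared error computed only over the hidden (masked) entries."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0  # nothing was hidden, so there is nothing to score
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff[mask] ** 2))

# Toy usage: two of four observations are hidden from the "model".
target = np.array([1.0, 0.0, 3.0, 0.0])
mask = np.array([False, True, False, True])
model_input = apply_mask(target, mask)
pred = np.array([1.0, 2.0, 3.0, 4.0])  # stand-in for a model's output
loss = masked_mse(pred, target, mask)
```

Scoring only the hidden entries is what pushes a model of this kind to infer missing values from the observations it can see, rather than simply copying its input.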

AO-PH · Jan 7, 2022
Explainable deep learning for insights in El Niño and river flows

Yumin Liu, Kate Duffy, Jennifer G. Dy et al.

The El Niño Southern Oscillation (ENSO) is a semi-periodic fluctuation in sea surface temperature (SST) over the tropical central and eastern Pacific Ocean that influences interannual variability in regional hydrology across the world through long-range dependence or teleconnections. Recent research has demonstrated the value of Deep Learning (DL) methods for improving ENSO prediction as well as Complex Networks (CN) for understanding teleconnections. However, gaps in predictive understanding of ENSO-driven river flows include the black box nature of DL, the use of simple ENSO indices to describe a complex phenomenon, and the translation of DL-based ENSO predictions into river flow predictions. Here we show that eXplainable DL (XDL) methods, based on saliency maps, can extract interpretable predictive information contained in global SST and discover SST information regions and dependence structures relevant for river flows which, in tandem with climate network constructions, enable improved predictive understanding. Our results reveal additional information content in global SST beyond ENSO indices, develop understanding of how SSTs influence river flows, and generate improved river flow prediction, including uncertainty estimation. Observations, reanalysis data, and earth system model simulations are used to demonstrate the value of the XDL-CN based methods for future interannual and decadal scale climate projections.
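As a rough illustration of what a gradient-based saliency map computes, the sketch below uses a toy one-neuron model, y = tanh(w·x + b), and takes the absolute gradient of the output with respect to each input. The model, the function name `saliency`, and the example values are assumptions for illustration only; the paper applies saliency methods to deep networks over global SST fields, not to this toy model.

```python
import numpy as np

def saliency(w, b, x):
    """Absolute gradient |dy/dx| for the toy model y = tanh(w . x + b).

    Larger values mark inputs whose small changes move the output most.
    """
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    z = float(np.dot(w, x) + b)
    grad = (1.0 - np.tanh(z) ** 2) * w  # chain rule: dtanh(z)/dz * dz/dx
    return np.abs(grad)

# Toy usage: three input "grid cells"; the second has zero weight.
sal = saliency(w=[2.0, 0.0, -1.0], b=0.0, x=[0.0, 0.0, 0.0])
```

In this toy case the second input has zero saliency because the model ignores it. An input region that a trained network does not rely on would show low saliency in the same way.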

LG · Oct 29, 2019
A framework for deep learning emulation of numerical models with a case study in satellite remote sensing

Kate Duffy, Thomas Vandal, Weile Wang et al.

Numerical models based on physics represent the state-of-the-art in earth system modeling and comprise our best tools for generating insights and predictions. Despite rapid growth in computational power, the perceived need for higher model resolutions overwhelms the latest-generation computers, reducing the ability of modelers to generate simulations for understanding parameter sensitivities and characterizing variability and uncertainty. Thus, surrogate models are often developed to capture the essential attributes of the full-blown numerical models. Recent successes of machine learning methods, especially deep learning, across many disciplines offer the possibility that complex nonlinear connectionist representations may be able to capture the underlying complex structures and nonlinear processes in earth systems. A difficult test for deep learning-based emulation, which refers to function approximation of numerical models, is to understand whether they can be comparable to traditional forms of surrogate models in terms of computational efficiency while simultaneously reproducing model results in a credible manner. A deep learning emulation that passes this test may be expected to perform even better than simple models with respect to capturing complex processes and spatiotemporal dependencies. Here we examine, with a case study in satellite-based remote sensing, the hypothesis that deep learning approaches can credibly represent the simulations from a surrogate model with comparable computational efficiency. Our results are encouraging in that the deep learning emulation reproduces the results with acceptable accuracy and often even faster performance. We discuss the broader implications of our results in light of the pace of improvements in high-performance implementations of deep learning as well as the growing desire for higher-resolution simulations in the earth sciences.

CV · Feb 12, 2019
Progressively Growing Generative Adversarial Networks for High Resolution Semantic Segmentation of Satellite Images

Edward Collier, Kate Duffy, Sangram Ganguly et al.

Machine learning has proven to be useful in classification and segmentation of images. In this paper, we evaluate a training methodology for pixel-wise segmentation on high resolution satellite images using progressive growing of generative adversarial networks. We apply our model to segmenting building rooftops and compare these results to conventional methods for rooftop segmentation. We present our findings using the SpaceNet version 2 dataset. Progressive GAN training achieved a test accuracy of 93% compared to 89% for traditional GAN training.