LG, AO-PH
May 1, 2024

Machine Learning Techniques for Data Reduction of Climate Applications

Xiao Li, Qian Gong, Jaemoon Lee, Scott Klasky, Anand Rangarajan, Sanjay Ranka

arXiv:2405.00879v1
6.43 citations
h-index: 45
PAKDD

Originality: Incremental advance

AI Analysis

This work addresses data storage and processing challenges for climate scientists, but it is incremental as it builds on existing compression and neural network methods.

The paper tackles the problem of reducing large-scale climate simulation data without compromising derived quantities-of-interest (QoI), such as tropical cyclone detection. It uses a pipelined compression approach: neural networks first identify regions where QoI are likely to be present, and a Guaranteed Autoencoder (GAE) then applies low-error compression to those regions only, achieving high overall compression ratios while keeping the QoI intact.

Scientists conduct large-scale simulations to compute derived quantities-of-interest (QoI) from primary data. Often, QoI are linked to specific features, regions, or time intervals, such that data can be adaptively reduced without compromising the integrity of QoI. For many spatiotemporal applications, these QoI are binary in nature and represent the presence or absence of a physical phenomenon. We present a pipelined compression approach that first uses neural-network-based techniques to derive regions where QoI are highly likely to be present. Then, we employ a Guaranteed Autoencoder (GAE) to compress data with differential error bounds. GAE uses QoI information to apply low-error compression to only these regions. This results in overall high compression ratios while still achieving the downstream goals of the simulation or data collection. Experimental results are presented for climate data generated from the E3SM Simulation model for downstream quantities such as tropical cyclone and atmospheric river detection and tracking. These results show that our approach is superior to comparable methods in the literature.
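
The idea of differential error bounds can be illustrated with a minimal sketch. It is not the paper's GAE: as a stand-in, it uses plain uniform scalar quantization, and the function name `quantize_with_region_bounds`, the synthetic field, the rectangular mask, and the two bound values are all assumptions made for illustration. In the paper the mask would come from the neural-network stage; here it is hard-coded.

```python
import numpy as np


def quantize_with_region_bounds(data, qoi_mask, tight_bound, loose_bound):
    """Uniform scalar quantization with a per-point absolute error bound.

    Hypothetical helper, not the paper's GAE: points inside qoi_mask get
    tight_bound, all other points get loose_bound.
    """
    # Per-point error bound: tight where a QoI is likely, loose elsewhere.
    bounds = np.where(qoi_mask, tight_bound, loose_bound)
    # A bin width of 2 * bound keeps the rounding error within the bound.
    bin_width = 2.0 * bounds
    codes = np.round(data / bin_width).astype(np.int64)
    recon = codes * bin_width
    return codes, recon


# Synthetic stand-in for one 2D slice of a climate variable (assumption).
rng = np.random.default_rng(0)
field = rng.normal(size=(64, 64))

# Hard-coded stand-in for the region a detector would flag (assumption).
mask = np.zeros(field.shape, dtype=bool)
mask[20:30, 20:30] = True

codes, recon = quantize_with_region_bounds(field, mask, 1e-3, 1e-1)
err = np.abs(field - recon)

print("max error inside QoI region: ", err[mask].max())
print("max error outside QoI region:", err[~mask].max())
print("distinct codes inside / outside:",
      np.unique(codes[mask]).size, "/", np.unique(codes[~mask]).size)
```

Outside the flagged region the coarser bins produce far fewer distinct integer codes per point, which is what lets a subsequent entropy coder reach a higher compression ratio there, while the flagged region keeps the tight bound.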
