LG · Dec 22, 2024

Foundation Model for Lossy Compression of Spatiotemporal Scientific Data

Xiao Li, Jaemoon Lee, Anand Rangarajan, Sanjay Ranka

arXiv:2412.17184v16.42 citations · h-index: 14 · PAKDD

Originality: Incremental advance

AI Analysis

This work targets the storage and transmission costs of large-scale scientific simulations. It is an incremental advance: it integrates existing techniques, namely VAEs with hyper-priors and super-resolution, into a single compression framework.

The paper tackles lossy compression of spatiotemporal scientific data by proposing a foundation model that combines a VAE with hyper-priors and a super-resolution module. After domain-specific fine-tuning, it reports up to 4 times higher compression ratios than state-of-the-art methods, and the super-resolution module improves compression ratio by 30% compared to simple upsampling.

We present a foundation model (FM) for lossy scientific data compression, combining a variational autoencoder (VAE) with a hyper-prior structure and a super-resolution (SR) module. The VAE framework uses hyper-priors to model latent space dependencies, enhancing compression efficiency. The SR module refines low-resolution representations into high-resolution outputs, improving reconstruction quality. By alternating between 2D and 3D convolutions, the model efficiently captures spatiotemporal correlations in scientific data while maintaining low computational cost. Experimental results demonstrate that the FM generalizes well to unseen domains and varying data shapes, achieving up to 4 times higher compression ratios than state-of-the-art methods after domain-specific fine-tuning. The SR module improves compression ratio by 30 percent compared to simple upsampling techniques. This approach significantly reduces storage and transmission costs for large-scale scientific simulations while preserving data integrity and fidelity.
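
The abstract's claim that alternating 2D and 3D convolutions keeps computational cost low can be illustrated with a rough parameter count. The sketch below is not the paper's architecture: the layer depth, channel width, and kernel size are hypothetical values chosen for illustration, and the helper names (`conv2d_params`, `conv3d_params`, `stack_params`) are invented here. It relies only on the standard weight count of a convolution layer (input channels × output channels × kernel volume, plus one bias per output channel).

```python
def conv2d_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k 2D convolution."""
    return c_in * c_out * k * k + c_out


def conv3d_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k x k 3D convolution."""
    return c_in * c_out * k ** 3 + c_out


def stack_params(kinds, channels=64, k=3):
    """Total parameters of a stack of same-width layers, e.g. ["2d", "3d"]."""
    count = {"2d": conv2d_params, "3d": conv3d_params}
    return sum(count[kind](channels, channels, k) for kind in kinds)


# Hypothetical 4-layer stacks; depth, width, and kernel size are assumptions,
# not values taken from the paper.
all_3d = stack_params(["3d"] * 4)
alternating = stack_params(["2d", "3d"] * 2)
print(f"all-3D: {all_3d}, alternating: {alternating}, "
      f"ratio: {alternating / all_3d:.2f}")
```

Under these assumed settings, the alternating stack needs about two thirds of the parameters of an all-3D stack of the same depth. The multiply-accumulate count per output position scales with the same kernel volume, so the compute saving per position is similar.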
