LG · May 8, 2025

GraphComp: Extreme Error-bounded Compression of Scientific Data via Temporal Graph Autoencoders

arXiv:2505.06316v11 citations · h-index: 31
Originality: Highly original
AI Analysis

GRAPHCOMP addresses the storage and transfer challenges faced by users of scientific data, offering a novel approach that exploits spatial and temporal correlations to improve compression.

The paper tackles the problem of compressing voluminous scientific data by proposing GRAPHCOMP, a graph-based method that uses temporal graph autoencoders to achieve high compression ratios while guaranteeing a user-defined point-wise error bound. On most datasets it outperforms the second-best state-of-the-art method by 22% to 50% in compression ratio.

The generation of voluminous scientific data poses significant challenges for efficient storage, transfer, and analysis. Recently, error-bounded lossy compression methods emerged due to their ability to achieve high compression ratios while controlling data distortion. However, they often overlook the inherent spatial and temporal correlations within scientific data, thus missing opportunities for higher compression. In this paper we propose GRAPHCOMP, a novel graph-based method for error-bounded lossy compression of scientific data. We perform irregular segmentation of the original grid data and generate a graph representation that preserves the spatial and temporal correlations. Inspired by Graph Neural Networks (GNNs), we then propose a temporal graph autoencoder to learn latent representations that significantly reduce the size of the graph, effectively compressing the original data. Decompression reverses the process and utilizes the learnt graph model together with the latent representation to reconstruct an approximation of the original data. The decompressed data are guaranteed to satisfy a user-defined point-wise error bound. We compare our method against the state-of-the-art error-bounded lossy methods (i.e., HPEZ, SZ3.1, SPERR, and ZFP) on large-scale real and synthetic data. GRAPHCOMP consistently achieves the highest compression ratio across most datasets, outperforming the second-best method by margins ranging from 22% to 50%.
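The abstract does not say how the point-wise error bound is enforced, so the following is only a minimal sketch of one common technique for error-bounded lossy compressors: quantizing the residual between the original data and a model's prediction with a step of twice the error bound. It is not presented as GRAPHCOMP's actual mechanism. The function names `compress_residuals` and `decompress` are hypothetical, and the noisy `predicted` list merely stands in for the output of a learned decoder.

```python
import math
import random


def compress_residuals(original, predicted, eb):
    """Quantize residuals so the reconstruction stays within +/- eb.

    Assumption: `predicted` is whatever a learned model reconstructs;
    here it is just a list of floats. Returns integer codes plus a
    dict of exact values for points where rounding breaks the bound.
    """
    step = 2.0 * eb
    codes, exact = [], {}
    for i, (x, p) in enumerate(zip(original, predicted)):
        c = round((x - p) / step)
        # Verify the bound with the same arithmetic the decoder uses;
        # fall back to storing the exact value on floating-point edge cases.
        if abs(x - (p + c * step)) > eb:
            exact[i] = x
            c = 0
        codes.append(c)
    return codes, exact


def decompress(predicted, codes, exact, eb):
    """Rebuild the data from the prediction, the codes, and the exact fallbacks."""
    step = 2.0 * eb
    out = [p + c * step for p, c in zip(predicted, codes)]
    for i, x in exact.items():
        out[i] = x
    return out


# Toy data: a smooth signal and a noisy stand-in for a model prediction.
random.seed(0)
original = [math.sin(0.1 * i) for i in range(100)]
predicted = [x + random.uniform(-0.05, 0.05) for x in original]

eb = 1e-3  # user-defined point-wise error bound
codes, exact = compress_residuals(original, predicted, eb)
restored = decompress(predicted, codes, exact, eb)
max_error = max(abs(a - b) for a, b in zip(original, restored))
print("max point-wise error within bound:", max_error <= eb)
```

In a scheme like this, the better the prediction, the smaller the integer codes, and small codes compress well under an entropy coder. That is the general reason a model that captures spatial and temporal correlations can yield higher compression ratios.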

