DC AI · Apr 20, 2024

GWLZ: A Group-wise Learning-based Lossy Compression Framework for Scientific Data

Wenqi Jia, Sian Jin, Jinzhen Wang, Wei Niu, Dingwen Tao, Miao Yin

arXiv:2404.13470v13.37 citations · h-index: 31 · FlexScience@HPDC

Originality: Incremental advance

AI Analysis

GWLZ addresses data management challenges for HPC systems by improving the reconstruction quality of lossy-compressed data. The advance is incremental: it builds on existing error-bounded compressors and adds deep learning enhancements on top of them.

The paper tackles the problem of limited reconstruction quality in error-bounded lossy compression for exascale scientific data by proposing GWLZ, a group-wise learning-based framework, which achieves up to 20% quality enhancements with negligible overhead as low as 0.0003x.

The rapid expansion of computational capabilities and the ever-growing scale of modern HPC systems present formidable challenges in managing exascale scientific data. Faced with such vast datasets, traditional lossless compression techniques prove insufficient in reducing data size to a manageable level while preserving all information intact. In response, researchers have turned to error-bounded lossy compression methods, which offer a balance between data size reduction and information retention. However, despite their utility, these compressors employing conventional techniques struggle with limited reconstruction quality. To address this issue, we draw inspiration from recent advancements in deep learning and propose GWLZ, a novel group-wise learning-based lossy compression framework with multiple lightweight learnable enhancer models. Leveraging a group of neural networks, GWLZ significantly enhances the decompressed data reconstruction quality with negligible impact on the compression efficiency. Experimental results on different fields from the Nyx dataset demonstrate remarkable improvements by GWLZ, achieving up to 20% quality enhancements with negligible overhead as low as 0.0003x.
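
The idea of pairing an error-bounded compressor with a group of small enhancers can be illustrated with a toy sketch. This is not the GWLZ implementation: the abstract does not say how groups are formed or what the enhancer networks look like, so everything below is an assumption made for illustration. The uniform quantizer stands in for an error-bounded compressor, the groups are equal-width value ranges of the reconstruction, and each "enhancer" is a two-parameter linear fit rather than a neural network. All function names are hypothetical. The enhancers are fitted at compression time on the same data they later correct, and the sketch does not re-enforce the error bound after enhancement.

```python
import math
import random


def quantize(data, eb):
    """Uniform quantization with absolute error bound eb: |x - x_hat| <= eb."""
    step = 2.0 * eb
    return [round(x / step) * step for x in data]


def group_of(r, lo, width, n_groups):
    """Map a reconstructed value to its value-range group (assumed grouping)."""
    return min(int((r - lo) / width), n_groups - 1)


def fit_group_enhancers(original, recon, n_groups):
    """Fit one (scale, offset) pair per group by ordinary least squares.

    Hypothetical stand-in for the lightweight learnable enhancers.
    """
    lo, hi = min(recon), max(recon)
    width = (hi - lo) / n_groups
    if width == 0.0:
        width = 1.0  # constant field: every value lands in group 0
    # Per-group running sums: count, sum r, sum x, sum r*r, sum r*x
    sums = [[0, 0.0, 0.0, 0.0, 0.0] for _ in range(n_groups)]
    for x, r in zip(original, recon):
        s = sums[group_of(r, lo, width, n_groups)]
        s[0] += 1
        s[1] += r
        s[2] += x
        s[3] += r * r
        s[4] += r * x
    params = []
    for n, sr, sx, srr, srx in sums:
        denom = n * srr - sr * sr
        if n < 2 or abs(denom) < 1e-12:
            params.append((1.0, 0.0))  # too little data: keep identity
        else:
            a = (n * srx - sr * sx) / denom
            params.append((a, (sx - a * sr) / n))
    return lo, width, params


def enhance(recon, lo, width, params):
    """Apply each group's enhancer to the decompressed values."""
    out = []
    for r in recon:
        a, b = params[group_of(r, lo, width, len(params))]
        out.append(a * r + b)
    return out


def psnr(original, approx):
    """Peak signal-to-noise ratio in dB, using the original's value range."""
    mse = sum((x - y) ** 2 for x, y in zip(original, approx)) / len(original)
    value_range = max(original) - min(original)
    return 20 * math.log10(value_range) - 10 * math.log10(mse)


# Synthetic 1-D "field" (made-up data, not from the Nyx dataset)
random.seed(0)
field = [math.sin(0.01 * i) + 0.05 * random.gauss(0, 1) for i in range(2000)]
eb = 0.01

recon = quantize(field, eb)
lo, width, params = fit_group_enhancers(field, recon, n_groups=4)
enhanced = enhance(recon, lo, width, params)

psnr_before = psnr(field, recon)
psnr_after = psnr(field, enhanced)
print(f"PSNR before: {psnr_before:.3f} dB, after: {psnr_after:.3f} dB")
```

Because each group's fit includes the identity mapping as a candidate, the enhanced PSNR on the fitted data cannot fall below the baseline (up to floating-point rounding). The only extra data to store is two parameters per group, which is the sense in which such enhancers can be cheap relative to the compressed field.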
