Tripti Agarwal

DC
h-index2
4papers
Novelty66%
AI Score42

4 Papers

26.1DCMar 27
Fast Topology-Aware Lossy Data Compression with Full Preservation of Critical Points and Local Order

Alex Fallin, Nathaniel Gorski, Tripti Agarwal et al.

Many scientific codes and instruments generate large amounts of floating-point data at high rates that must be compressed before they can be stored. Typically, only lossy compression algorithms deliver high-enough compression ratios. However, many of them provide only point-wise error bounds and do not preserve topological aspects of the data such as the relative magnitude of neighboring points. Even topology-preserving compressors tend to merely preserve some critical points and are generally slow. Our Local-Order-Preserving Compressor is the first to preserve the full local order (and thus all critical points), runs orders of magnitude faster than prior topology-preserving compressors, yields higher compression ratios than lossless compressors, and produces bit-for-bit the same output on CPUs and GPUs.

58.7DBMar 26
Enabling Homomorphic Analytical Operations on Compressed Scientific Data with Multi-stage Decompression

Xuan Wu, Sheng Di, Tripti Agarwal et al.

Error-controlled lossy compressors have been widely used in scientific applications to reduce the unprecedented size of scientific data while keeping data distortion within a user-specified threshold. While they significantly mitigate the pressure for data storage and transmission, they prolong the time to access the data because decompression is required to transform the binary compressed data into meaningful floating-point numbers. This incurs noticeable overhead for common analytical operations on scientific data that extract or derive useful information, because the time cost of the operations could be much lower than that of decompression. In this work, we design an error-controlled lossy compression and analytical framework that features multi-stage decompression and homomorphic analytical operation algorithms on intermediate decompressed data for reduced data access latency. Our contributions are threefold. (1) We abstract a generic compression pipeline with partial decompression to multiple intermediate data representations and implement four instances based on state-of-the-art high-throughput scientific data compressors. (2) We carefully design homomorphic algorithms to enable direct operations on intermediate decompressed data for three types of analytical operations on scientific data. (3) We evaluate our approach using five real-world scientific datasets. Experimental evaluations demonstrate that our method achieves significant speedups when performing analytical operations on compressed scientific data across all three targeted analytical operation types.

LGApr 5, 2025
Directional Sign Loss: A Topology-Preserving Loss Function that Approximates the Sign of Finite Differences

Harvey Dam, Tripti Agarwal, Ganesh Gopalakrishnan

Preserving topological features in learned latent spaces is a fundamental challenge in representation learning, particularly for topology-sensitive data. This paper introduces directional sign loss (DSL), an efficient, differentiable loss function that approximates the number of mismatches in the signs of finite differences between corresponding elements of two arrays. By penalizing discrepancies in critical points between input and reconstructed data, DSL encourages autoencoders and other learnable compressors to retain the topological features of the original data. We present the formulation and complexity analysis of DSL, comparing it to other non-differentiable topological measures. Experiments on multidimensional array data show that combining DSL with traditional loss functions preserves topological features more effectively than traditional losses alone. DSL serves as a differentiable, efficient proxy for common topology-based metrics, enabling topological feature preservation on previously impractical problem sizes and in a wider range of gradient-based optimization frameworks.

DCJun 17, 2024
What Operations can be Performed Directly on Compressed Arrays, and with What Error?

Tripti Agarwal, Harvey Dam, Dorra Ben Khalifa et al.

In response to the rapidly escalating costs of computing with large matrices and tensors caused by data movement, several lossy compression methods have been developed to significantly reduce data volumes. Unfortunately, all these methods require the data to be decompressed before further computations are done. In this work, we develop a lossy compressor that allows a dozen fairly fundamental operations directly on compressed data while offering good compression ratios and modest errors. We implement a new compressor PyBlaz based on the familiar GPU-powered PyTorch framework, and evaluate it on three non-trivial applications, choosing different number systems for internal representation. Our results demonstrate that the compressed-domain operations achieve good scalability with problem sizes while incurring errors well within acceptable limits. To our best knowledge, this is the first such lossy compressor that supports compressed-domain operations while achieving acceptable performance as well as error.