Danny Harnik

CR
h-index: 19
4 papers
59 citations
Novelty: 43%
AI Score: 25

4 Papers

LG, Apr 5, 2024
Lossless and Near-Lossless Compression for Foundation Models

Moshik Hershcovitch, Leshem Choshen, Andrew Wood et al.

With the growth of model sizes and the scale of their deployment, the sheer size of models burdens the infrastructure, requiring more network bandwidth and more storage to accommodate them. While there is a vast literature on reducing model sizes, we investigate a more traditional type of compression: one that compresses the model to a smaller form and is coupled with a decompression algorithm that returns it to its original size, namely lossless compression. Somewhat surprisingly, we show that such lossless compression can yield significant network and storage reduction on popular models, at times reducing model size by over 50%. We investigate the source of model compressibility, introduce compression variants tailored for models, and categorize models into compressibility groups. We also introduce a tunable lossy compression technique that can further reduce size, even on the less compressible models, with little to no effect on model accuracy. We estimate that these methods could save over an exabyte per month of network traffic downloaded from a large model hub like Hugging Face.
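The contrast between lossless and tunable lossy compression can be illustrated with a small sketch. It is not the authors' technique: it assumes float32 weights drawn from a synthetic Gaussian, uses Python's general-purpose `zlib` as the compressor, and models "tunable lossy" as zeroing a chosen number of low mantissa bits. The names `make_weights`, `truncate_mantissa`, and `drop_bits` are hypothetical.

```python
import random
import struct
import zlib


def make_weights(n, seed=0):
    """Synthetic stand-in for model weights (assumption: small Gaussian values)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.02) for _ in range(n)]


def to_float32_bytes(values):
    """Serialize values as little-endian float32."""
    return struct.pack(f"<{len(values)}f", *values)


def truncate_mantissa(raw, drop_bits):
    """Zero the lowest `drop_bits` mantissa bits of every float32 in `raw`.

    This is the tunable knob in this sketch: more dropped bits means
    a more compressible byte stream and a larger rounding error.
    """
    mask = (0xFFFFFFFF << drop_bits) & 0xFFFFFFFF
    n = len(raw) // 4
    words = struct.unpack(f"<{n}I", raw)
    return struct.pack(f"<{n}I", *(w & mask for w in words))


raw = to_float32_bytes(make_weights(20000))

# Lossless: decompression returns the exact original bytes.
lossless = zlib.compress(raw, 6)
restored = zlib.decompress(lossless)

# Near-lossless: perturb the weights slightly, then compress losslessly.
lossy = truncate_mantissa(raw, drop_bits=12)
lossy_compressed = zlib.compress(lossy, 6)

print(len(raw), len(lossless), len(lossy_compressed))
```

Dropping 12 of the 23 mantissa bits bounds the relative error of each weight by 2^-11 in this sketch; whether that is acceptable for a given model is an empirical question the code does not answer.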

LG, Nov 7, 2024
ZipNN: Lossless Compression for AI Models

Moshik Hershcovitch, Andrew Wood, Leshem Choshen et al.

With the growth of model sizes and the scale of their deployment, the sheer size of models burdens the infrastructure, requiring more network bandwidth and more storage to accommodate them. While there is a vast model compression literature on deleting parts of the model weights for faster inference, we investigate a more traditional type of compression: one that represents the model in a compact form and is coupled with a decompression algorithm that returns it to its original form and size, namely lossless compression. We present ZipNN, a lossless compression method tailored to neural networks. Somewhat surprisingly, we show that specific lossless compression can yield significant network and storage reduction on popular models, often saving 33% and at times reducing model size by over 50%. We investigate the source of model compressibility and introduce specialized compression variants tailored for models that further increase the effectiveness of compression. On popular models (e.g., Llama 3), ZipNN shows space savings that are over 17% better than vanilla compression while also improving compression and decompression speeds by 62%. We estimate that these methods could save over an exabyte per month of network traffic downloaded from a large model hub like Hugging Face.
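One way to see why a compressor tailored to floating-point weights can beat a vanilla one is byte grouping: storing the first byte of every value together, then the second byte, and so on, so that the low-entropy sign and exponent bytes form a long, highly compressible run. The sketch below illustrates that general idea only. It is an assumption that it resembles anything inside ZipNN, and it uses `zlib` and synthetic Gaussian float32 data; `group_bytes` and `ungroup_bytes` are hypothetical helper names.

```python
import random
import struct
import zlib


def group_bytes(raw, width=4):
    """Reorder bytes so byte 0 of every value comes first, then byte 1, etc.

    Assumes len(raw) is a multiple of `width`.
    """
    return b"".join(raw[i::width] for i in range(width))


def ungroup_bytes(grouped, width=4):
    """Inverse of group_bytes: interleave the byte planes back into values."""
    n = len(grouped) // width
    out = bytearray(len(grouped))
    for i in range(width):
        out[i::width] = grouped[i * n:(i + 1) * n]
    return bytes(out)


# Synthetic stand-in for float32 weights (assumption, not real model data).
rng = random.Random(0)
values = [rng.gauss(0.0, 0.02) for _ in range(20000)]
raw = struct.pack(f"<{len(values)}f", *values)

plain = zlib.compress(raw, 6)
grouped = zlib.compress(group_bytes(raw), 6)

# Still lossless: undo the compression, then undo the grouping.
restored = ungroup_bytes(zlib.decompress(grouped))

print(len(raw), len(plain), len(grouped))
```

The grouping step is a pure permutation of bytes, so it adds no information loss; any gain comes from giving the compressor a more regular input.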

CR, Aug 8, 2018
It Takes Two to #MeToo - Using Enclaves to Build Autonomous Trusted Systems

Danny Harnik, Paula Ta-Shma, Eliad Tsfadia

We provide enhanced security against insider attacks in services that manage extremely sensitive data. One example is a #MeToo use case where sexual harassment complaints are reported but only revealed when another complaint is filed against the same perpetrator. Such a service places tremendous trust in its operators, a burden our work aims to relieve. To this end, we introduce a new autonomous data management concept that transfers responsibility for the sensitive data from administrators to secure and verifiable hardware. The main idea is to manage all data access via a cluster of autonomous computation agents running inside Intel SGX enclaves. These EConfidante agents share a secret data key that is unknown to any external entity, including the data service administrators, thus eliminating many opportunities for data exposure. In this paper we describe a detailed design of the EConfidante system, its flow, and how it is managed and implemented. Our #MeToo design also uses an immutable distributed ledger built from components of a Blockchain framework. We implement a proof of concept of our system for the #MeToo use case and analyze its security properties and implementation details.
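The release rule in the #MeToo use case (a complaint stays sealed until another complaint names the same perpetrator) can be sketched in a few lines. This shows only the matching logic in plain Python: there is no enclave, encryption, or ledger here, and the class name `ComplaintEscrow`, the `threshold` parameter, and the choice to count distinct reporters are all assumptions made for illustration rather than details of the EConfidante design.

```python
from collections import defaultdict


class ComplaintEscrow:
    """Holds complaints sealed until enough distinct reporters name one person."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        # accused -> {reporter: complaint text}; in the real system this
        # state would be protected, here it is an ordinary dict.
        self._sealed = defaultdict(dict)

    def file(self, reporter, accused, text):
        """Record a complaint; return revealed complaints, or [] if still sealed."""
        # A repeat filing by the same reporter replaces the earlier one
        # and does not count as a second, independent complaint.
        self._sealed[accused][reporter] = text
        complaints = self._sealed[accused]
        if len(complaints) >= self.threshold:
            return sorted(complaints.items())
        return []


escrow = ComplaintEscrow()
first = escrow.file("reporter-a", "person-x", "first complaint")
second = escrow.file("reporter-b", "person-x", "second complaint")
print(first)
print(second)
```

In this sketch whoever runs the process can read `_sealed` directly, which is exactly the insider exposure the abstract describes moving into trusted hardware.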

CR, Jun 28, 2018
Securing the Storage Data Path with SGX Enclaves

Danny Harnik, Eliad Tsfadia, Doron Chen et al.

We explore the use of SGX enclaves as a means to improve the security of handling keys and data in storage systems. We study two main configurations for SGX computations as they apply to performing data-at-rest encryption in a storage system. The first configuration aims to protect the encryption keys used in the encryption process. The second configuration aims to protect both the encryption keys and the data, thus providing end-to-end security of the entire data path. Our main contribution is an evaluation of the viability of SGX for data-at-rest encryption from a performance perspective and an understanding of the details that go into using enclaves in a performance-sensitive environment. Our tests paint a complex picture: on the one hand, SGX can indeed achieve high encryption and decryption throughput, comparable to running without SGX. On the other hand, there are many subtleties to achieving such performance, and careful design choices and testing are required.