TensorHub: Rethinking AI Model Hub with Tensor-Centric Compression
For AI model hub operators and users, TensorHub reduces storage costs without sacrificing model usability, though the novelty is incremental.
TensorHub addresses storage and distribution challenges in AI model hubs by using tensor-centric deduplication and compression, achieving substantial storage savings with minimal overhead on real-world model repositories.
Modern AI models are growing rapidly in both size and redundancy, creating significant storage and distribution challenges for model hubs. We present TensorHub, a tensor-centric system that reduces storage overhead through fine-grained deduplication and compression. TensorHub uses tensor-level fingerprinting and clustering to identify redundancy across models without requiring annotations. The design reduces storage while preserving model usability and performance. Experiments on real-world model repositories demonstrate substantial storage savings with minimal overhead.
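
To make the idea of tensor-level fingerprinting concrete, here is a minimal sketch of exact-match deduplication across two models. The abstract does not say how TensorHub computes fingerprints, so this example assumes a plain SHA-256 hash over each tensor's dtype, shape, and raw bytes. The `TensorStore` class, the `fingerprint` and `pack` helpers, and the toy models are all hypothetical names introduced for illustration. The clustering and compression stages are not shown.

```python
import hashlib
import struct


def fingerprint(dtype: str, shape: tuple, data: bytes) -> str:
    """Hash dtype, shape, and raw bytes (assumed scheme, not TensorHub's own)."""
    h = hashlib.sha256()
    h.update(dtype.encode())
    h.update(repr(tuple(shape)).encode())
    h.update(data)
    return h.hexdigest()


class TensorStore:
    """Hypothetical content-addressed store: each unique tensor is kept once."""

    def __init__(self):
        self.blobs = {}      # fingerprint -> raw tensor bytes
        self.manifests = {}  # model name -> {tensor name: fingerprint}

    def add_model(self, model_name, tensors):
        manifest = {}
        for name, (dtype, shape, data) in tensors.items():
            fp = fingerprint(dtype, shape, data)
            # Only the first copy of a given tensor is stored.
            self.blobs.setdefault(fp, data)
            manifest[name] = fp
        self.manifests[model_name] = manifest

    def load_tensor(self, model_name, tensor_name):
        # Resolve the tensor through the model's manifest.
        return self.blobs[self.manifests[model_name][tensor_name]]

    def stored_bytes(self):
        return sum(len(b) for b in self.blobs.values())


def pack(values):
    """Serialize a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


# Toy example: a fine-tuned model that shares its embedding with the base model.
embed = pack([0.1, 0.2, 0.3, 0.4])
base = {"embed": ("f32", (2, 2), embed), "head": ("f32", (2,), pack([1.0, 2.0]))}
tuned = {"embed": ("f32", (2, 2), embed), "head": ("f32", (2,), pack([1.5, 2.5]))}

store = TensorStore()
store.add_model("base", base)
store.add_model("tuned", tuned)

logical = sum(len(d) for m in (base, tuned) for (_, _, d) in m.values())
print(logical, store.stored_bytes())  # prints "48 32"
```

In this toy case the shared embedding is stored once, so 48 logical bytes occupy 32 stored bytes, and each model can still be reassembled from its manifest. Exact hashing only catches bit-identical tensors; finding near-duplicates would need the kind of clustering the abstract mentions.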