IT, LG | Apr 9, 2018

Universal and Succinct Source Coding of Deep Neural Networks

arXiv:1804.02800v2
AI Analysis

The paper addresses the storage limitations that hinder scaling deep learning as a service and on-device intelligence, and offers a novel compression approach.

The paper tackles the problem of large storage requirements for deep neural networks by proposing a universal lossless compression method that exploits permutation invariance in network layers, enabling inference without full decompression and achieving near-entropy bound compression rates.

Deep neural networks have shown incredible performance for inference tasks in a variety of domains. Unfortunately, most current deep networks are enormous cloud-based structures that require significant storage space, which limits scaling of deep learning as a service (DLaaS) and use for on-device intelligence. This paper is concerned with finding universal lossless compressed representations of deep feedforward networks with synaptic weights drawn from discrete sets, and directly performing inference without full decompression. The basic insight that allows a lower rate than naive approaches is recognizing that the bipartite graph layers of feedforward networks have a kind of permutation invariance to the labeling of nodes, in terms of inferential operation. We provide efficient algorithms to dissipate this irrelevant uncertainty and then use arithmetic coding to nearly achieve the entropy bound in a universal manner. We also provide experimental results of our approach on several standard datasets.
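The permutation invariance mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm. It only checks, on a toy two-layer network, that relabeling the hidden nodes leaves the output unchanged, and it counts the bits an ordered description can waste on that labeling. The weight set `levels`, the layer sizes, the ReLU activation, and the helper names (`forward`, `permute_hidden`, `relabeling_bits`) are all assumptions made for illustration.

```python
import math
import random


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    return [max(0, a) for a in v]


def forward(W1, W2, x):
    # Two-layer feedforward pass: input -> hidden (ReLU) -> output.
    return matvec(W2, relu(matvec(W1, x)))


def permute_hidden(W1, W2, perm):
    # Relabel hidden nodes: reorder the rows of W1 and the columns of W2
    # with the same permutation.
    W1p = [W1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, W2p


def relabeling_bits(n_hidden):
    # log2(n!) bits: the most an ordered description can spend on the
    # hidden-node labeling, reached when all hidden nodes are distinct.
    return math.log2(math.factorial(n_hidden))


rng = random.Random(0)
levels = [-1, 0, 1]  # assumed discrete weight set
n_in, n_hidden, n_out = 4, 6, 3  # assumed toy sizes

W1 = [[rng.choice(levels) for _ in range(n_in)] for _ in range(n_hidden)]
W2 = [[rng.choice(levels) for _ in range(n_hidden)] for _ in range(n_out)]
x = [rng.choice([0, 1, 2]) for _ in range(n_in)]

perm = list(range(n_hidden))
rng.shuffle(perm)
W1p, W2p = permute_hidden(W1, W2, perm)

# Integer weights and inputs keep the arithmetic exact, so equality is safe.
same_output = forward(W1, W2, x) == forward(W1p, W2p, x)
print(same_output, round(relabeling_bits(n_hidden), 2))
```

Because every one of the n! hidden-node orderings computes the same function, a coder that ignores the ordering can in principle save up to log2(n!) bits per hidden layer. This figure comes from a standard counting argument and is not quoted from the paper.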
