LG DS OC ML

Jan 1, 2020

Lossless Compression of Deep Neural Networks

Thiago Serra, Abhinav Kumar, Srikumar Ramalingam

arXiv:2001.00218v3

20.861 citations

Originality: Incremental advance

AI Analysis

This work addresses efficient neural network deployment in mobile and other resource-constrained environments. It represents an incremental improvement in compression techniques.

The paper tackles the challenge of deploying large neural networks on devices with limited computational resources. It introduces LEO, a lossless compression algorithm that removes units and layers without changing the network's output. LEO uses MILP to identify ReLUs that behave linearly over the input domain, and it uses L1 regularization during training to induce that behavior.

Deep neural networks have been successful in many predictive modeling tasks, such as image and language recognition, where large neural networks are often used to obtain good accuracy. Consequently, it is challenging to deploy these networks under limited computational resources, such as in mobile devices. In this work, we introduce an algorithm that removes units and layers of a neural network while not changing the output that is produced, which thus implies a lossless compression. This algorithm, which we denote as LEO (Lossless Expressiveness Optimization), relies on Mixed-Integer Linear Programming (MILP) to identify Rectified Linear Units (ReLUs) with linear behavior over the input domain. By using L1 regularization to induce such behavior, we can benefit from training over a larger architecture than we would later use in the environment where the trained neural network is deployed.
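To make the idea of a ReLU with "linear behavior over the input domain" concrete, here is a minimal sketch in plain Python. It is not the paper's LEO algorithm: instead of MILP it uses interval arithmetic, a looser but still sound sufficient test, and it only removes units that are provably always inactive. The network weights, the box input domain `[0, 1]^2`, and all function names are hypothetical and chosen for illustration.

```python
import random

def preactivation_bounds(weights, bias, box):
    """Interval bounds of w.x + b when each x_i ranges over box[i] = (low, high)."""
    lo = hi = bias
    for w, (low, high) in zip(weights, box):
        lo += min(w * low, w * high)
        hi += max(w * low, w * high)
    return lo, hi

def forward(x, W1, b1, W2, b2):
    """One hidden ReLU layer followed by a linear output layer."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

def prune_stably_inactive(W1, b1, W2, box):
    """Drop hidden units whose pre-activation is never positive on the box.

    Such a unit always outputs 0, so removing it cannot change the output.
    """
    keep = [j for j, (row, b) in enumerate(zip(W1, b1))
            if preactivation_bounds(row, b, box)[1] > 0.0]
    W1_small = [W1[j] for j in keep]
    b1_small = [b1[j] for j in keep]
    W2_small = [[row[j] for j in keep] for row in W2]
    return keep, W1_small, b1_small, W2_small

# Hypothetical toy network: 2 inputs, 3 hidden ReLUs, 1 output.
box = [(0.0, 1.0), (0.0, 1.0)]
W1 = [[1.0, -1.0],    # unstable: pre-activation ranges over [-1, 1]
      [-1.0, -2.0],   # stably inactive: ranges over [-3.5, -0.5]
      [0.5, 0.5]]     # stably active: ranges over [0.25, 1.25]
b1 = [0.0, -0.5, 0.25]
W2 = [[1.0, 3.0, -2.0]]
b2 = [0.1]

keep, W1_small, b1_small, W2_small = prune_stably_inactive(W1, b1, W2, box)

# Units whose pre-activation is never negative act as plain linear functions.
stably_active = [j for j, (row, b) in enumerate(zip(W1, b1))
                 if preactivation_bounds(row, b, box)[0] >= 0.0]

# Empirical check: the pruned network agrees with the original on the box.
random.seed(0)
max_gap = 0.0
for _ in range(1000):
    x = [random.uniform(low, high) for low, high in box]
    full = forward(x, W1, b1, W2, b2)
    small = forward(x, W1_small, b1_small, W2_small, b2)
    max_gap = max(max_gap, max(abs(a - c) for a, c in zip(full, small)))

print("kept units:", keep)
print("stably active units:", stably_active)
print("largest output difference:", max_gap)
```

Interval bounds can miss units that an exact MILP test would certify as stable, so this sketch may prune less than an exact method. The sketch leaves stably active units in place and only reports them.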
