LG, NE, Nov 11, 2019

A Computing Kernel for Network Binarization on PyTorch

arXiv:1911.04477v1, 4 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of deploying deep neural networks on low-power devices by providing a practical tool for researchers and engineers who use PyTorch. It is incremental, since it builds on existing binarization techniques.

The paper tackles the lack of a computing kernel for network binarization on PyTorch. Network binarization is a technique for model compression and acceleration, and the kernel developed here accelerates inference by 3 times on GPU and 4.5 times on CPU compared to a control group.

Deep neural networks now achieve state-of-the-art results on a wide range of tasks, including image classification and object detection. However, they are both computationally expensive and memory intensive, which makes them difficult to deploy on low-power devices. Network binarization is an effective existing technique for model compression and acceleration, but there is no computing kernel yet to support it on PyTorch. In this paper we develop a computing kernel supporting 1-bit XNOR and bitcount computation on PyTorch. Experimental results show that our kernel accelerates the inference of the binarized neural network by 3 times on GPU and by 4.5 times on CPU compared with the control group.
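The XNOR and bitcount computation mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It assumes the common binarization encoding in which +1 is stored as bit 1 and -1 as bit 0, so that a dot product of two length-n vectors equals 2 * (number of matching bits) - n. The helper names `pack_bits` and `xnor_popcount_dot` are hypothetical and are not the paper's kernel API; the actual kernel runs on packed tensors on CPU and GPU rather than on Python integers.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into one int (bit i is 1 when values[i] == +1)."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("expected only +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1,+1} vectors of length n via XNOR and bitcount."""
    mask = (1 << n) - 1                 # keep only the n valid bits
    xnor = ~(a_bits ^ b_bits) & mask    # bit is 1 where the two signs agree
    matches = bin(xnor).count("1")      # bitcount (popcount)
    return 2 * matches - n              # agreements minus disagreements


# Compare against the ordinary multiply-accumulate dot product.
a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(x * y for x, y in zip(a, b))
fast = xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a))
assert fast == reference
```

The point of the encoding is that one XNOR and one bitcount over a machine word replace many multiply-accumulate operations, which is where a binarized kernel gets its speedup.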

Code Implementations: 1 repo
