DC CV OH

Mar 5, 2019

TinBiNN: Tiny Binarized Neural Network Overlay in about 5,000 4-LUTs and 5mW

Guy G. F. Lemieux, Joe Edwards, Joel Vandergriendt, Aaron Severance, Ryan De Iaco, Abdullah Raouf, Hussein Osman, Tom Watzka, Satwant Singh

arXiv:1903.06630v13.39 citations

Originality: Incremental advance

AI Analysis

TinBiNN enables efficient neural network inference on low-cost FPGAs for embedded applications such as person detection, but the advance is incremental because it builds on existing binary-weighted methods.

The paper tackles the problem of deploying neural networks on resource-constrained hardware by introducing TinBiNN, a tiny overlay for accelerating inference with 1-bit weights and 8-bit activations, achieving 13.6% error on a 10-category classifier and 0.4% error on a 1-category classifier.

Reduced-precision arithmetic improves the size, cost, power and performance of neural networks in digital logic. In convolutional neural networks, the use of 1b weights can achieve state-of-the-art error rates while eliminating multiplication, reducing storage and improving power efficiency. The BinaryConnect binary-weighted system, for example, achieves 9.9% error using floating-point activations on the CIFAR-10 dataset. In this paper, we introduce TinBiNN, a lightweight vector processor overlay for accelerating inference computations with 1b weights and 8b activations. The overlay is very small: it uses about 5,000 4-input LUTs and fits into a low-cost iCE40 UltraPlus FPGA from Lattice Semiconductor. To show this can be useful, we build two embedded 'person detector' systems by shrinking the original BinaryConnect network. The first is a 10-category classifier with an 89% smaller network that runs in 1,315ms and achieves 13.6% error. The other is a 1-category classifier that is even smaller, runs in 195ms, and has only 0.4% error. In both classifiers, the error can be attributed entirely to training and not reduced precision.
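To illustrate why 1b weights eliminate multiplication, here is a minimal Python sketch of a binary-weighted dot product. It is not the TinBiNN implementation: the function names, the bit encoding (1 means +1, 0 means -1, following the {-1, +1} weights used by BinaryConnect), and the treatment of activations as unsigned 8-bit values are all assumptions made for illustration.

```python
def binary_dot(activations, weight_bits):
    """Dot product of 8-bit activations with 1-bit weights.

    Assumed encoding (hypothetical, for illustration only):
    weight bit 1 stands for +1, weight bit 0 stands for -1.
    Activations are assumed to be unsigned 8-bit values (0..255).
    """
    if len(activations) != len(weight_bits):
        raise ValueError("activations and weight_bits must have equal length")
    acc = 0
    for a, bit in zip(activations, weight_bits):
        if not 0 <= a <= 255:
            raise ValueError("activation out of assumed 8-bit range")
        # Each weight only selects add or subtract; no multiplier is needed.
        acc = acc + a if bit else acc - a
    return acc


def reference_dot(activations, weight_bits):
    """Same result computed with explicit +1/-1 multiplications."""
    return sum(a * (1 if bit else -1) for a, bit in zip(activations, weight_bits))
```

For example, `binary_dot([10, 20, 30], [1, 0, 1])` computes 10 - 20 + 30 using only additions and subtractions, and agrees with `reference_dot` on the same inputs. Each weight also occupies a single bit of storage, which is where the storage reduction mentioned in the abstract comes from.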
