DCCVOHMar 5, 2019

TinBiNN: Tiny Binarized Neural Network Overlay in about 5,000 4-LUTs and 5mW

arXiv:1903.06630v19 citations
Originality Incremental advance
AI Analysis

This enables efficient neural network inference on low-cost FPGAs for embedded applications like person detection, but it is incremental as it builds on existing binary-weighted methods.

The paper tackles the problem of deploying neural networks on resource-constrained hardware by introducing TinBiNN, a tiny overlay for accelerating inference with 1-bit weights and 8-bit activations, achieving 13.6% error on a 10-category classifier and 0.4% error on a 1-category classifier.

Reduced-precision arithmetic improves the size, cost, power and performance of neural networks in digital logic. In convolutional neural networks, the use of 1b weights can achieve state-of-the-art error rates while eliminating multiplication, reducing storage and improving power efficiency. The BinaryConnect binary-weighted system, for example, achieves 9.9% error using floating-point activations on the CIFAR-10 dataset. In this paper, we introduce TinBiNN, a lightweight vector processor overlay for accelerating inference computations with 1b weights and 8b activations. The overlay is very small -- it uses about 5,000 4-input LUTs and fits into a low cost iCE40 UltraPlus FPGA from Lattice Semiconductor. To show this can be useful, we build two embedded 'person detector' systems by shrinking the original BinaryConnect network. The first is a 10-category classifier with a 89% smaller network that runs in 1,315ms and achieves 13.6% error. The other is a 1-category classifier that is even smaller, runs in 195ms, and has only 0.4% error. In both classifiers, the error can be attributed entirely to training and not reduced precision.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes