CV · Jul 15, 2017

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration

arXiv:1707.04693v1 · 24 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency challenges for deploying CNNs on resource-constrained hardware like mobile devices, though it appears incremental as it builds on existing binarized CNN methods.

The paper tackles the high computational and memory costs of convolutional neural networks on embedded platforms by proposing BCNN with Separable Filters. In an FPGA accelerator for CIFAR-10, the method reduces memory usage by 17% and execution time by 31.3% relative to a plain BCNN, with only a minor loss in accuracy.

State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution. Such networks strain the computational capabilities and energy available to embedded and mobile processing platforms, restricting their use in many important applications. In this paper, we push the boundaries of hardware-effective CNN design by proposing BCNN with Separable Filters (BCNNw/SF), which applies Singular Value Decomposition (SVD) on BCNN kernels to further reduce computational and storage complexity. To enable its implementation, we provide a closed form of the gradient over SVD to calculate the exact gradient with respect to every binarized weight in backward propagation. We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware. Our BCNNw/SF accelerator realizes memory savings of 17% and execution time reduction of 31.3% compared to BCNN with only minor accuracy sacrifices.
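
As a rough illustration of the separable-filter idea in the abstract, the sketch below uses NumPy to take the rank-1 SVD approximation of a single binarized 3x3 kernel and re-binarize the two resulting 1-D factors. It is not the authors' implementation: the helper names `rank1_separable` and `binarize`, the random kernel, and the choice to map zero to +1 are all assumptions made for illustration, and the paper's training procedure (including its closed-form gradient over SVD) is not reproduced here.

```python
import numpy as np


def rank1_separable(kernel):
    """Return (v, h) so that np.outer(v, h) is the best rank-1 approximation of kernel."""
    u, s, vt = np.linalg.svd(kernel)
    scale = np.sqrt(s[0])      # split the leading singular value between both factors
    v = u[:, 0] * scale        # vertical (column) 1-D filter
    h = vt[0, :] * scale       # horizontal (row) 1-D filter
    return v, h


def binarize(x):
    """Map every entry to +1 or -1 (zero is mapped to +1 by assumption)."""
    return np.where(x >= 0, 1.0, -1.0)


# Hypothetical binarized 3x3 kernel, standing in for one BCNN filter.
rng = np.random.default_rng(0)
kernel = binarize(rng.standard_normal((3, 3)))

# Factor into two 1-D filters, then binarize each factor.
v, h = rank1_separable(kernel)
bv, bh = binarize(v), binarize(h)
separable_kernel = np.outer(bv, bh)   # rank-1 kernel with entries in {-1, +1}

# Storage per filter: k*k weights for the full kernel, 2*k for the separable pair.
k = kernel.shape[0]
weights_full = k * k
weights_separable = 2 * k
print(f"weights per filter: full={weights_full}, separable={weights_separable}")
```

For a k x k filter the separable form stores 2k weights instead of k², and a 2-D convolution with it can be carried out as two 1-D convolutions. The overall 17% and 31.3% savings quoted above are the paper's accelerator-level measurements and are not derived from this per-filter count.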

