LG, AR | Nov 3, 2017

SparseNN: An Energy-Efficient Neural Network Accelerator Exploiting Input and Output Sparsity

arXiv:1711.01263v1 | 38 citations
Originality: Incremental advance
AI Analysis

This work targets the energy efficiency of DNN hardware accelerators, offering an incremental improvement by exploiting both input and output activation sparsity.

The paper tackles the high computation and energy demands of deep neural networks by exploiting activation sparsity, reporting a 10%-70% throughput improvement and a power reduction of around 50% compared with state-of-the-art accelerators that exploit only input sparsity.

Contemporary Deep Neural Networks (DNNs) contain millions of synaptic connections across tens to hundreds of layers. Their large computation and memory requirements pose a challenge to hardware design. In this work, we leverage the intrinsic activation sparsity of DNNs to substantially reduce execution cycles and energy consumption. An end-to-end training algorithm is proposed to develop a lightweight run-time predictor that estimates the output activation sparsity on the fly. In our experiments, the computation overhead of the prediction phase can be reduced to less than 5% of the original feedforward phase with negligible accuracy loss. Furthermore, an energy-efficient hardware architecture, SparseNN, is proposed to exploit both input and output sparsity. SparseNN is a scalable architecture with distributed memories and processing elements connected through a dedicated on-chip network. Compared with state-of-the-art accelerators, which exploit only input sparsity, SparseNN achieves a 10%-70% improvement in throughput and a power reduction of around 50%.
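
The idea of skipping work on both sides of a layer can be illustrated with a small software sketch. The abstract does not describe the predictor's form, so the low-rank factorization used here (matrices `U` and `V` whose product approximates the weight matrix `W`) is an assumption made for illustration, and all function names are hypothetical. The sketch skips zero inputs, uses the cheap predictor to guess which outputs a ReLU would zero out, and counts the multiply-accumulates (MACs) actually performed.

```python
def matvec(M, x, cols=None):
    """Multiply matrix M (a list of rows) by vector x.

    If cols is given, only those column indices are visited.
    """
    if cols is None:
        cols = range(len(x))
    return [sum(row[j] * x[j] for j in cols) for row in M]


def sparse_relu_layer(W, x, U, V):
    """Compute relu(W @ x), skipping zero inputs and predicted-zero outputs.

    U (m x r) and V (r x n) form an assumed low-rank predictor, U @ V ~ W.
    Returns (y, macs), where macs counts multiply-accumulates performed.
    """
    # Input sparsity: only nonzero activations contribute to any output.
    nz = [j for j, v in enumerate(x) if v != 0]

    # Prediction phase: the sign of U @ (V @ x) guesses which outputs survive ReLU.
    t = matvec(V, x, nz)
    pred = matvec(U, t)
    macs = len(V) * len(nz) + len(U) * len(t)

    # Feedforward phase: compute only the rows predicted to be active.
    y = [0.0] * len(W)
    for i, p in enumerate(pred):
        if p > 0:
            acc = sum(W[i][j] * x[j] for j in nz)
            y[i] = max(acc, 0.0)
            macs += len(nz)
    return y, macs


def predictor_overhead(m, n, r):
    """Dense-case cost of a rank-r predictor relative to the full m x n layer."""
    return r * (m + n) / (m * n)


W = [[1, 2, 3, 4], [-1, -2, -3, -4], [2, 1, 0, 1], [-3, 0, -1, -2]]
U = [[1], [-1], [1], [-1]]   # hypothetical rank-1 predictor
V = [[1, 1, 1, 1]]
x = [1, 0, 2, 0]             # half of the inputs are zero

y, macs = sparse_relu_layer(W, x, U, V)
print(y, macs)
print(predictor_overhead(1024, 1024, 16))
```

In this toy case the layer needs 10 MACs instead of the 16 a dense evaluation would take. Under the low-rank assumption, a rank-16 predictor for a 1024 x 1024 layer costs about 3% of the dense layer, which is the kind of budget the abstract's "less than 5%" figure refers to. A mispredicted row would produce a wrong zero, which is why the predictor has to be trained for accuracy.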
