SPARCV
May 3, 2020

Lupulus: A Flexible Hardware Accelerator for Neural Networks

arXiv:2005.01016v1
AI Analysis

This work addresses the need for programmable hardware that can keep up with evolving neural network requirements. The contribution is incremental, as it builds on existing accelerator designs.

The authors tackle the high computational and memory requirements of neural networks with Lupulus, a flexible hardware accelerator. It achieves a peak performance of 380 GOPS/GHz, with latencies of 21.4 ms and 183.6 ms for the convolutional layers of AlexNet and VGG-16, respectively.

Neural networks have become indispensable for a wide range of applications, but they suffer from high computational and memory requirements, which call for optimizations at every level, from the algorithmic description of the network down to the hardware implementation. Moreover, the high rate of innovation in machine learning makes it important that hardware implementations provide a high level of programmability to support current and future requirements of neural networks. In this work, we present a flexible hardware accelerator for neural networks, called Lupulus, which supports various methods for scheduling and mapping operations onto the accelerator. Lupulus was implemented in a 28 nm FD-SOI technology and demonstrates a peak performance of 380 GOPS/GHz, with latencies of 21.4 ms and 183.6 ms for the convolutional layers of AlexNet and VGG-16, respectively.
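
The peak figure above is normalized by clock frequency, so it can be read as operations per clock cycle: 1 GOPS/GHz is 10^9 operations per second divided by 10^9 cycles per second, i.e. one operation per cycle. The short sketch below only illustrates that unit conversion. The function names and the clock frequencies in the loop are hypothetical, chosen for illustration; the abstract does not state an operating frequency.

```python
def ops_per_cycle(gops_per_ghz: float) -> float:
    """Convert a frequency-normalized peak (GOPS/GHz) to operations per cycle.

    1 GOPS/GHz = 1e9 ops/s divided by 1e9 cycles/s = 1 op/cycle.
    """
    return gops_per_ghz * 1e9 / 1e9


def peak_gops(gops_per_ghz: float, clock_ghz: float) -> float:
    """Scale the normalized peak to absolute GOPS at a given clock frequency."""
    return gops_per_ghz * clock_ghz


if __name__ == "__main__":
    normalized_peak = 380.0  # GOPS/GHz, as reported in the abstract
    print(f"operations per cycle: {ops_per_cycle(normalized_peak):.0f}")

    # Hypothetical clock frequencies, NOT taken from the paper.
    for clock_ghz in (0.25, 0.5, 1.0):
        print(f"{clock_ghz:.2f} GHz -> {peak_gops(normalized_peak, clock_ghz):.1f} GOPS peak")
```

Peak throughput is an upper bound. The reported AlexNet and VGG-16 latencies also depend on how well each layer's scheduling and mapping keep the compute units busy, which this conversion does not model.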
