NE CV LG

Jan 28, 2019

A Simple Method to Reduce Off-chip Memory Accesses on Convolutional Neural Networks

Doyun Kim, Kyoung-Young Kim, Sangsoo Ko, Sanghyuck Ha

arXiv:1901.09614v1

9.25 citations

Originality: Incremental advance

AI Analysis

This work addresses efficiency issues for hardware implementations of neural networks, particularly in mobile or embedded systems, but is incremental as it builds on existing memory optimization techniques.

The authors tackle excessive off-chip memory accesses in convolutional neural networks with a simple algorithm that maximizes on-chip memory usage in a neural processing unit. For Inception-V3 on Samsung's NPU, the reported result is off-chip memory accesses cut to roughly 1/50 and a 97.59% reduction in feature-map data transfer.

For convolutional neural networks, a simple algorithm is proposed that reduces off-chip memory accesses by maximally utilizing the on-chip memory of a neural processing unit (NPU). In particular, the algorithm provides an effective way to process a module that consists of multiple branches and a merge layer. For Inception-V3 on Samsung's NPU in Exynos, our evaluation shows that the proposed algorithm reduces off-chip memory accesses to about 1/50, and accordingly achieves a 97.59% reduction in the amount of feature-map data transferred from/to off-chip memory.
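The abstract does not spell out the algorithm, so the following is not the authors' method. It is a minimal toy model, under stated assumptions, of why keeping feature maps in on-chip memory cuts off-chip traffic for a module with several branches and a merge layer. The `Layer` class, the `offchip_traffic` function, the greedy "keep it on-chip if it fits" policy, and all byte sizes are hypothetical. Weights and the read of the network input are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    inputs: list     # names of producer layers ([] for the first layer)
    out_bytes: int   # size of this layer's output feature map (made-up units)


def offchip_traffic(layers, capacity):
    """Count feature-map bytes moved to/from off-chip memory.

    Assumed policy (illustrative only): an output stays on-chip if it fits
    next to the feature maps already held there; otherwise it is written
    off-chip and read back once per consumer. capacity=0 is the baseline
    in which every feature map goes through off-chip memory.
    """
    # How many consumers still need each feature map.
    remaining = {layer.name: 0 for layer in layers}
    for layer in layers:
        for src in layer.inputs:
            remaining[src] += 1

    size = {layer.name: layer.out_bytes for layer in layers}
    on_chip = {}   # feature maps currently held in on-chip memory
    traffic = 0

    for layer in layers:             # layers are given in execution order
        # Inputs already on-chip cost nothing; the rest are read off-chip.
        for src in layer.inputs:
            if src not in on_chip:
                traffic += size[src]

        # Place the output while the inputs are still resident.
        is_final = remaining[layer.name] == 0
        used = sum(on_chip.values())
        if not is_final and used + layer.out_bytes <= capacity:
            on_chip[layer.name] = layer.out_bytes
        else:
            traffic += layer.out_bytes   # spilled, or final network output

        # Free inputs whose last consumer was this layer.
        for src in layer.inputs:
            remaining[src] -= 1
            if remaining[src] == 0:
                on_chip.pop(src, None)

    return traffic


# A small Inception-like module: two branches merged by a concat layer.
module = [
    Layer("stem", [], 100),
    Layer("b1", ["stem"], 40),
    Layer("b2a", ["stem"], 30),
    Layer("b2b", ["b2a"], 40),
    Layer("concat", ["b1", "b2b"], 80),
    Layer("head", ["concat"], 10),
]

baseline = offchip_traffic(module, capacity=0)
buffered = offchip_traffic(module, capacity=256)
print(baseline, buffered)
```

In this toy module the branch outputs and the merge output all fit in a 256-unit buffer, so only the final output is written off-chip. With a smaller buffer, some maps are spilled and the saving shrinks. The numbers illustrate the mechanism only and say nothing about the Inception-V3 results reported above.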
