LG · Jan 3, 2019

HG-Caffe: Mobile and Embedded Neural Network GPU (OpenCL) Inference Engine with FP16 Supporting

arXiv:1901.00858v1

Originality: Incremental advance

AI Analysis

This work addresses computational and battery efficiency problems for mobile and embedded AI applications, representing an incremental improvement in optimization.

The paper tackles the challenge of deep neural network inference on edge AI devices by presenting HG-Caffe, a GPU-based inference engine with half-precision support, achieving up to a 20-fold speedup over the original implementations and reducing peak memory usage to about 80%.

Breakthroughs in the fields of deep learning and mobile system-on-chips are radically changing the way we use our smartphones. However, deep neural network inference is still a challenging task for edge AI devices due to the computational overhead on mobile CPUs and a severe drain on batteries. In this paper, we present a deep neural network inference engine named HG-Caffe, which supports GPUs with half precision. HG-Caffe provides up to a 20-fold speedup on GPUs compared to the original implementations. In addition to the speedup, the peak memory usage is also reduced to about 80%. With HG-Caffe, more innovative and fascinating mobile applications will be turned into reality.
