NE Oct 12, 2018

Training Deep Neural Network in Limited Precision

Hyunsun Park, Jun Haeng Lee, Youngmin Oh, Sangwon Ha, Seungwon Lee

arXiv:1810.05486v1 4.88 citations

Originality: Incremental advance

AI Analysis

This work addresses energy and resource efficiency for deep learning applications, but it is incremental as it builds on existing low-precision training methods.

The paper tackles the problem of training deep neural networks with limited precision by addressing gradient loss during parameter updates and backpropagation through softmax layers, achieving effective low-precision training across various network architectures and benchmarks.

Energy- and resource-efficient training of DNNs will greatly extend the applications of deep learning. However, there are three major obstacles which mandate accurate calculation in high precision. In this paper, we tackle two of them, related to the loss of gradients during parameter update and during backpropagation through a softmax nonlinearity layer in low-precision training. We implemented SGD with Kahan summation by employing an additional parameter to virtually extend the bit-width of the parameters for a reliable parameter update. We also proposed a simple guideline to help select the appropriate bit-width for the last FC layer followed by a softmax nonlinearity layer. It determines the lower bound of the required bit-width based on the class size of the dataset. Extensive experiments on various network architectures and benchmarks verify the effectiveness of the proposed technique for low-precision training.
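
The idea of SGD with Kahan summation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy's float16 as a stand-in for the paper's low-precision format, and the names `sgd_step_naive`, `sgd_step_kahan`, and `simulate` are hypothetical. The extra value `c` plays the role of the additional parameter that keeps the rounding error lost by earlier updates.

```python
import numpy as np

F16 = np.float16  # assumed stand-in for a low-precision parameter format


def sgd_step_naive(w, grad, lr):
    """Plain SGD update with every intermediate rounded to float16."""
    return F16(w - F16(F16(lr) * grad))


def sgd_step_kahan(w, c, grad, lr):
    """SGD update with Kahan compensation, all values kept in float16.

    c carries the part of earlier updates that was rounded away.
    """
    update = F16(-F16(lr) * grad)   # the step we would like to apply
    y = F16(update - c)             # add back the error lost so far
    t = F16(w + y)                  # low-precision parameter update
    c_new = F16(F16(t - w) - y)     # what was actually applied minus what was intended
    return t, c_new


def simulate(steps, lr):
    """Apply a constant gradient of 1.0 for `steps` updates, starting from w = 1.0."""
    grad = F16(1.0)
    w_naive = F16(1.0)
    w_kahan, c = F16(1.0), F16(0.0)
    for _ in range(steps):
        w_naive = sgd_step_naive(w_naive, grad, lr)
        w_kahan, c = sgd_step_kahan(w_kahan, c, grad, lr)
    return w_naive, w_kahan


if __name__ == "__main__":
    w_naive, w_kahan = simulate(steps=1000, lr=1e-4)
    print("naive float16 SGD:", float(w_naive))
    print("Kahan float16 SGD:", float(w_kahan))
    print("exact arithmetic :", 1.0 - 1000 * 1e-4)
```

In this toy setting each update (about 1e-4) is smaller than half the float16 spacing near 1.0, so the naive parameter never moves, while the compensated version accumulates the lost updates in `c` and ends close to the exact result of 0.9.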
