LG | ML | Aug 7, 2018

Rethinking Numerical Representations for Deep Neural Networks

arXiv:1808.02513v1 | 11 citations
Originality: Incremental advance
AI Analysis

This work addresses computational bottlenecks for deep learning practitioners by enabling faster inference with minimal accuracy loss, representing an incremental improvement in hardware-aware optimization.

The paper tackles computational inefficiency in deep neural network inference by exploring unconventional narrow-precision floating-point representations for weights and activations. On models such as GoogLeNet and VGG, it reports an average speedup of 7.6x with less than 1% accuracy degradation relative to a single-precision floating-point baseline.

With ever-increasing computational demand for deep learning, it is critical to investigate the implications of the numeric representation and precision of DNN model weights and activations on computational efficiency. In this work, we explore unconventional narrow-precision floating-point representations as they relate to inference accuracy and efficiency to steer the improved design of future DNN platforms. We show that inference using these custom numeric representations on production-grade DNNs, including GoogLeNet and VGG, achieves an average speedup of 7.6x with less than 1% degradation in inference accuracy relative to a state-of-the-art baseline platform representing the most sophisticated hardware using single-precision floating point. To facilitate the use of such customized precision, we also present a novel technique that drastically reduces the time required to derive the optimal precision configuration.
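
To make "narrow-precision floating point" concrete, here is a minimal Python sketch that rounds a value to a hypothetical custom format with a chosen number of mantissa and exponent bits. It is an illustration only, not the paper's implementation. The function name `quantize`, the IEEE-style exponent bias, the flush-to-zero underflow, and the saturating overflow are all assumptions made for this example.

```python
import math

def quantize(x, mantissa_bits, exponent_bits):
    """Round x to the nearest value in a custom floating-point format.

    Assumptions (not taken from the paper): IEEE-style bias with
    exponents in [1 - bias, bias], no subnormals (underflow flushes to
    zero), and overflow saturates to the largest finite value.
    """
    if x == 0.0 or math.isnan(x) or math.isinf(x):
        return x

    bias = 2 ** (exponent_bits - 1) - 1
    e_min, e_max = 1 - bias, bias

    # frexp gives |x| = m * 2**e with 0.5 <= m < 1, so the exponent of
    # the normalized form 1.f * 2**E is E = e - 1.
    _, e = math.frexp(abs(x))
    exp = e - 1

    # Significand in [1, 2), rounded to mantissa_bits fractional bits.
    scale = 2 ** mantissa_bits
    sig = math.ldexp(abs(x), -exp)
    sig = round(sig * scale) / scale
    if sig == 2.0:  # rounding carried into the next binade
        sig, exp = 1.0, exp + 1

    if exp > e_max:  # overflow: clamp to the largest finite value
        sig, exp = 2.0 - 1.0 / scale, e_max
    elif exp < e_min:  # underflow: flush to zero
        return math.copysign(0.0, x)

    return math.copysign(math.ldexp(sig, exp), x)


if __name__ == "__main__":
    # Compare a few values under an 8-bit-style format (3 mantissa bits,
    # 4 exponent bits) and a single-precision-style format (23, 8).
    for value in (1.1, 0.1, 300.0):
        print(value, quantize(value, 3, 4), quantize(value, 23, 8))
```

Applying such a function to every weight and activation is one simple way to estimate how much accuracy a model loses under a given bit budget. The speedup reported in the abstract depends on hardware designed for the narrower format, which this software simulation does not capture.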

