Adaptive Block Floating-Point for Analog Deep Learning Hardware
This paper addresses the problem of maintaining accuracy in energy-efficient analog hardware for deep neural network inference. It represents an incremental improvement, paired with a novel finetuning method.
The paper tackles the accuracy penalty from precision loss in analog mixed-signal deep learning hardware by introducing an adaptive block floating-point representation and amplification method, achieving less than 1% accuracy loss compared to FLOAT32 on MLPerf benchmarks.
Analog mixed-signal (AMS) devices promise faster, more energy-efficient deep neural network (DNN) inference than their digital counterparts. However, recent studies show that DNNs on AMS devices with fixed-point numbers can incur an accuracy penalty because of precision loss. To mitigate this penalty, we present a novel AMS-compatible adaptive block floating-point (ABFP) number representation. We also introduce amplification (or gain) as a method for increasing the accuracy of the number representation without increasing the bit precision of the output. We evaluate the effectiveness of ABFP on the DNNs in the MLPerf datacenter inference benchmark -- realizing less than $1\%$ loss in accuracy compared to FLOAT32. We also propose a novel method of finetuning for AMS devices, Differential Noise Finetuning (DNF), which samples device noise to speed up finetuning compared to conventional Quantization-Aware Training.
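To make the two ideas in the abstract concrete, the following is a minimal sketch of generic block floating-point quantization and of gain applied before a fixed-range readout. It is not the paper's ABFP scheme: the abstract does not specify how scales are chosen or adapted, so the max-abs per-block scale, the symmetric rounding, and the simple clip-and-round ADC model are all assumptions, and the names `quantize_block`, `bfp_quantize`, and `adc_readout` are hypothetical.

```python
from typing import List


def quantize_block(block: List[float], bits: int) -> List[float]:
    """Quantize one block to signed `bits`-bit levels that share a single scale."""
    levels = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    scale = max(abs(x) for x in block)      # assumption: max-abs scale per block
    if scale == 0.0:
        return [0.0 for _ in block]
    # Map to integer codes in [-levels, levels], then back to real values.
    return [round(x / scale * levels) / levels * scale for x in block]


def bfp_quantize(values: List[float], block_size: int, bits: int) -> List[float]:
    """Apply block-wise quantization over consecutive blocks of `block_size`."""
    out: List[float] = []
    for start in range(0, len(values), block_size):
        out.extend(quantize_block(values[start:start + block_size], bits))
    return out


def adc_readout(y: float, full_scale: float, bits: int, gain: float = 1.0) -> float:
    """Assumed ADC model: amplify, clip to the fixed range, round, undo the gain."""
    levels = 2 ** (bits - 1) - 1
    amplified = max(-full_scale, min(full_scale, y * gain))
    code = round(amplified / full_scale * levels)
    return code / levels * full_scale / gain


if __name__ == "__main__":
    values = [0.001, 0.002, 1.0, 0.5]
    # One scale for all four values versus one scale per pair of values.
    print(bfp_quantize(values, block_size=4, bits=8))
    print(bfp_quantize(values, block_size=2, bits=8))
    # Small output read with and without amplification at the same bit width.
    print(adc_readout(0.01, full_scale=1.0, bits=8))
    print(adc_readout(0.01, full_scale=1.0, bits=8, gain=8.0))
```

In this toy setting, a single shared scale rounds the small values in the first block to zero, while per-block scales preserve them. Likewise, amplifying a small output before the fixed-range readout reduces its rounding error without adding output bits, at the cost of clipping if the amplified value exceeds the range.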