Fixflow: A Framework to Evaluate Fixed-point Arithmetic in Light-Weight CNN Inference
This work addresses the need for efficient CNN inference on IoT devices by providing a tool to evaluate hardware-level fixed-point arithmetic, but it is incremental as it builds on existing quantization and re-training techniques.
The paper tackles the problem of evaluating how different fixed-point hardware units affect CNN inference accuracy in resource-constrained IoT devices, and it introduces the Fixflow framework to assess these effects, showing that hardware-level methods, especially with low precision, can significantly change classification accuracy.
Convolutional neural networks (CNNs) are widely used in resource-constrained devices in IoT applications. To reduce computational complexity and memory footprint, these devices use fixed-point representation, which consumes less area and energy in hardware than floating-point representation while achieving similar classification accuracy. However, employing low-precision fixed-point representation requires various considerations to maintain high accuracy. Although many quantization and re-training techniques have been proposed to improve inference accuracy, these approaches are time-consuming and require access to the entire dataset. This paper investigates the effect of different fixed-point hardware units on CNN inference accuracy. To this end, we provide a framework called Fixflow to evaluate the effect of fixed-point computations performed at the hardware level on CNN classification accuracy. We can employ different fixed-point considerations in the hardware accelerators, including rounding methods and adjusting the precision of the result of a fixed-point operation. Fixflow can determine the impact of employing different arithmetic units (such as truncated multipliers) on CNN classification accuracy. Moreover, we evaluate the energy and area consumption of these units in hardware accelerators. We perform experiments on two common datasets, MNIST and CIFAR-10. Our results show that employing different methods at the hardware level, especially with low precision, can significantly change the classification accuracy.
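To make the hardware-level choices mentioned above concrete, the following is a minimal sketch of how a rounding method and the precision of a multiplication result can be modeled in software. It is not Fixflow's implementation: the function names (`to_fixed`, `to_float`, `fixed_mul`), the signed format with saturation, and the two rounding modes are assumptions chosen for illustration. The "truncate" mode simply drops the low bits of the full product; it does not model a true truncated multiplier that omits partial-product columns.

```python
import math


def to_fixed(x: float, frac_bits: int, total_bits: int, mode: str = "nearest") -> int:
    """Quantize a real value to a signed fixed-point integer (hypothetical helper).

    The result is saturated to the range of a two's-complement word of
    `total_bits` bits, of which `frac_bits` are fractional.
    """
    scaled = x * (1 << frac_bits)
    if mode == "nearest":
        q = math.floor(scaled + 0.5)  # round half up
    elif mode == "truncate":
        q = math.floor(scaled)        # drop low bits (rounds toward -infinity)
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, q))


def to_float(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


def fixed_mul(a: int, b: int, frac_bits: int, total_bits: int, mode: str = "nearest") -> int:
    """Multiply two values in the same format and reduce the product to that format.

    The full product carries 2 * frac_bits fractional bits; the rounding
    mode decides how the extra frac_bits low bits are discarded.
    """
    if mode not in ("nearest", "truncate"):
        raise ValueError(f"unknown rounding mode: {mode}")
    prod = a * b
    if mode == "nearest" and frac_bits > 0:
        prod += 1 << (frac_bits - 1)  # add half an LSB before shifting
    q = prod >> frac_bits             # arithmetic shift floors the result
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, q))
```

For example, with 8-bit words and 4 fractional bits, 0.3 quantizes to 5 (0.3125) under "nearest" and to 4 (0.25) under "truncate". Multiplying 5 by 5 in this format gives 2 (0.125) under "nearest" and 1 (0.0625) under "truncate", while the exact product of 0.3125 with itself is about 0.0977. At such low precision, the choice of rounding method alone changes the result by a factor of two, which is the kind of effect that can accumulate across the layers of a CNN.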