MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators
This work addresses the energy efficiency of hardware accelerators for AI applications. It is an incremental improvement that adapts existing methods to a specific bottleneck: the energy consumed by accelerator weight memories.
The paper tackles the energy efficiency of neural network accelerators, particularly for fully-connected DNNs. It proposes MATIC, a methodology that enables aggressive voltage scaling of weight memories, yielding up to 3.3x total energy reduction or 18.6x application error reduction.
As a result of the increasing demand for deep neural network (DNN)-based services, efforts to develop dedicated hardware accelerators for DNNs are growing rapidly. However, while accelerators with high performance and efficiency on convolutional deep neural networks (Conv-DNNs) have been developed, less progress has been made on fully-connected DNNs (FC-DNNs). In this paper, we propose MATIC (Memory Adaptive Training with In-situ Canaries), a methodology that enables aggressive voltage scaling of accelerator weight memories to improve the energy efficiency of DNN accelerators. To enable accurate operation with voltage overscaling, MATIC combines the characteristics of destructive SRAM reads with the error resilience of neural networks in a memory-adaptive training process. Furthermore, PVT-related voltage margins are eliminated by using bit-cells from synaptic weights as in-situ canaries to track runtime environmental variation. Demonstrated on a low-power DNN accelerator that we fabricate in 65 nm CMOS, MATIC enables up to 60-80 mV of voltage overscaling (3.3x total energy reduction versus the nominal voltage), or 18.6x application error reduction.
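To make the idea of memory-adaptive training more concrete, the following is a minimal sketch of one plausible ingredient: forcing quantized weights through a per-bit fault map before they are used in the forward pass, so that training sees the same corrupted values the overscaled SRAM would return. The abstract does not specify the weight format, the fault model, or the training loop, so everything here is an assumption made for illustration: the 8-bit two's-complement fixed-point format with 6 fractional bits, the stuck-at-0/stuck-at-1 masks, and the function names `quantize`, `apply_fault_map`, and `faulty_weights` are all hypothetical.

```python
import numpy as np

def quantize(w, frac_bits=6, total_bits=8):
    """Round float weights to signed fixed-point integers (assumed format)."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return np.clip(np.round(w * (1 << frac_bits)), lo, hi).astype(np.int64)

def apply_fault_map(q, and_mask, or_mask, total_bits=8):
    """Force failing bit-cells to fixed values (assumed stuck-at fault model).

    and_mask has a 0 wherever a bit-cell is stuck at 0;
    or_mask has a 1 wherever a bit-cell is stuck at 1.
    """
    full = (1 << total_bits) - 1
    bits = q & full                      # two's-complement bit pattern
    bits = (bits & and_mask) | or_mask   # inject the stuck bits
    sign = 1 << (total_bits - 1)
    return (bits ^ sign) - sign          # reinterpret as a signed integer

def faulty_weights(w, and_mask, or_mask, frac_bits=6, total_bits=8):
    """Weights as a faulty memory would return them, converted back to float."""
    q = apply_fault_map(quantize(w, frac_bits, total_bits),
                        and_mask, or_mask, total_bits)
    return q.astype(np.float64) / (1 << frac_bits)

# Example: bit 5 of the first weight is stuck at 0, bit 0 of the second at 1.
w = np.array([0.5, -0.25])
and_mask = np.array([0xDF, 0xFF], dtype=np.int64)
or_mask = np.array([0x00, 0x01], dtype=np.int64)
w_seen = faulty_weights(w, and_mask, or_mask)
```

In a training loop built on this sketch, the forward pass would use `w_seen` while gradient updates are applied to the underlying float weights `w`, letting the network adapt around the cells that fail at the reduced voltage. How the fault map is obtained (for example, by profiling the SRAM at the target voltage) is outside what the abstract describes.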