Hasindu Gamaarachchi

2papers

2 Papers

ARSep 9, 2022Code
ApproxTrain: Fast Simulation of Approximate Multipliers for DNN Training and Inference

Jing Gong, Hassaan Saadat, Hasindu Gamaarachchi et al.

Edge training of Deep Neural Networks (DNNs) is a desirable goal for continuous learning; however, it is hindered by the enormous computational power required by training. Hardware approximate multipliers have shown their effectiveness for gaining resource-efficiency in DNN inference accelerators; however, training with approximate multipliers is largely unexplored. To build resource efficient accelerators with approximate multipliers supporting DNN training, a thorough evaluation of training convergence and accuracy for different DNN architectures and different approximate multipliers is needed. This paper presents ApproxTrain, an open-source framework that allows fast evaluation of DNN training and inference using simulated approximate multipliers. ApproxTrain is as user-friendly as TensorFlow (TF) and requires only a high-level description of a DNN architecture along with C/C++ functional models of the approximate multiplier. We improve the speed of the simulation at the multiplier level by using a novel LUT-based approximate floating-point (FP) multiplier simulator on GPU (AMSim). ApproxTrain leverages CUDA and efficiently integrates AMSim into the TensorFlow library, in order to overcome the absence of native hardware approximate multiplier in commercial GPUs. We use ApproxTrain to evaluate the convergence and accuracy of DNN training with approximate multipliers for small and large datasets (including ImageNet) using LeNets and ResNets architectures. The evaluations demonstrate similar convergence behavior and negligible change in test accuracy compared to FP32 and bfloat16 multipliers. Compared to CPU-based approximate multiplier simulations in training and inference, the GPU-accelerated ApproxTrain is more than 2500x faster. Based on highly optimized closed-source cuDNN/cuBLAS libraries with native hardware multipliers, the original TensorFlow is only 8x faster than ApproxTrain.

CRJan 3, 2018
Power Analysis Based Side Channel Attack

Hasindu Gamaarachchi, Harsha Ganegoda

Power analysis is a branch of side channel attacks where power consumption data is used as the side channel to attack the system. First using a device like an oscilloscope power traces are collected when the cryptographic device is doing the cryptographic operation. Then those traces are statistically analysed using methods such as Correlation Power Analysis (CPA) to derive the secret key of the system. Being possible to break Advanced Encryption Standard (AES) in few minutes, power analysis attacks have become a serious security issue for cryptographic devices such as smart card. As the first phase of our project, we build a testbed for doing research on power analysis attacks. As power analysis is a practical type of attack in order to do any research, a testbed is the first requirement. Since building a test bed is a complicated process, having a pre-built testbed would save the time of future researchers. The second phase of our project is to attack the latest cryptographic algorithm called Speck which has been released by National Security Agency (NSA) for use in embedded systems. In spite it has lot of differences to AES making impossible to directly use the power analysis approach used for AES, we introduce novel approaches to break Speck in less than an hour. In the third phase of the project, we select few already introduced countermeasures and practically attack them on our testbed to do a comparative analysis. We show that software countermeasures such as random instruction injection and randomly shuffling S-boxes are good enough for their simplicity and cost. But we identify the possible threat due to the problem of generating a good seed for the pseudo-random algorithm running on the microcontroller. We attempt to address this issue by using a hardware-based true random generator that amplifies a random electrical signal and samples to generate a proper seed.