Ashwin Bhat

AROct 19, 2022

Gradient Backpropagation based Feature Attribution to Enable Explainable-AI on the Edge

Ashwin Bhat, Adou Sangbone Assoa, Arijit Raychowdhury

There has been a recent surge in the field of Explainable AI (XAI) which tackles the problem of providing insights into the behavior of black-box machine learning models. Within this field, \textit{feature attribution} encompasses methods which assign relevance scores to input features and visualize them as a heatmap. Designing flexible accelerators for multiple such algorithms is challenging since the hardware mapping of these algorithms has not been studied yet. In this work, we first analyze the dataflow of gradient backpropagation based feature attribution algorithms to determine the resource overhead required over inference. The gradient computation is optimized to minimize the memory overhead. Second, we develop a High-Level Synthesis (HLS) based configurable FPGA design that is targeted for edge devices and supports three feature attribution algorithms. Tile based computation is employed to maximally use on-chip resources while adhering to the resource constraints. Representative CNNs are trained on CIFAR-10 dataset and implemented on multiple Xilinx FPGAs using 16-bit fixed-point precision demonstrating flexibility of our library. Finally, through efficient reuse of allocated hardware resources, our design methodology demonstrates a pathway to repurpose inference accelerators to support feature attribution with minimal overhead, thereby enabling real-time XAI on the edge.

LGFeb 22, 2023

Non-Uniform Interpolation in Integrated Gradients for Low-Latency Explainable-AI

Ashwin Bhat, Arijit Raychowdhury

There has been a surge in Explainable-AI (XAI) methods that provide insights into the workings of Deep Neural Network (DNN) models. Integrated Gradients (IG) is a popular XAI algorithm that attributes relevance scores to input features commensurate with their contribution to the model's output. However, it requires multiple forward \& backward passes through the model. Thus, compared to a single forward-pass inference, there is a significant computational overhead to generate the explanation which hinders real-time XAI. This work addresses the aforementioned issue by accelerating IG with a hardware-aware algorithm optimization. We propose a novel non-uniform interpolation scheme to compute the IG attribution scores which replaces the baseline uniform interpolation. Our algorithm significantly reduces the total interpolation steps required without adversely impacting convergence. Experiments on the ImageNet dataset using a pre-trained InceptionV3 model demonstrate \textit{2.6-3.6}$\times$ performance speedup on GPU systems for iso-convergence. This includes the minimal \textit{0.2-3.2}\% latency overhead introduced by the pre-processing stage of computing the non-uniform interpolation step-sizes.

Ashwin Bhat

2 Papers