Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network
This work addresses the need for explainable AI in time-series analysis, specifically for medical activity recognition, but it appears incremental as it applies existing gradient-based methods to a new model.
The paper tackles the problem of identifying important frames in time-series video data for activity classification. It combines a transformer-based spatiotemporal attention network (STAN) with gradient-based XAI techniques, and its experiments on datasets of four medically relevant activities suggest that the approach can identify important frames.
In this paper, we explore the feasibility of using a transformer-based spatiotemporal attention network (STAN) for gradient-based time-series explanations. First, we trained the STAN model for video classification using global and local views of the data and weakly supervised labels on the time-series data (i.e., the type of activity). We then leveraged a gradient-based explainable AI (XAI) technique (e.g., a saliency map) to identify salient frames of the time-series data. In experiments on datasets of four medically relevant activities, the STAN model demonstrated its potential to identify important frames of videos.
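To make the idea of gradient-based frame saliency concrete, the following is a minimal sketch and not the STAN model or the authors' implementation. It assumes a toy classifier that attention-pools per-frame feature vectors and applies a linear class score, and it approximates the gradient of that score with respect to each frame by central finite differences where a real pipeline would use automatic differentiation. All names (`attention_pool_logit`, `frame_saliency`, `w_att`, `w_cls`) and all numbers are hypothetical and chosen only for illustration.

```python
import math


def attention_pool_logit(frames, w_att, w_cls):
    """Toy stand-in for a video classifier (assumption, not STAN).

    frames: list of per-frame feature vectors.
    Returns one class score: softmax attention over frames, then a
    linear readout of the attention-weighted average feature vector.
    """
    scores = [sum(w * x for w, x in zip(w_att, f)) for f in frames]
    m = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    attn = [e / z for e in exps]
    dim = len(frames[0])
    pooled = [sum(a * f[d] for a, f in zip(attn, frames)) for d in range(dim)]
    return sum(w * p for w, p in zip(w_cls, pooled))


def frame_saliency(frames, w_att, w_cls, eps=1e-5):
    """Per-frame saliency: L2 norm of d(score)/d(frame features).

    The gradient is approximated with central finite differences; an
    autograd framework would compute the same quantity analytically.
    """
    saliency = []
    for t in range(len(frames)):
        squared_norm = 0.0
        for d in range(len(frames[t])):
            plus = [list(f) for f in frames]
            minus = [list(f) for f in frames]
            plus[t][d] += eps
            minus[t][d] -= eps
            grad = (
                attention_pool_logit(plus, w_att, w_cls)
                - attention_pool_logit(minus, w_att, w_cls)
            ) / (2 * eps)
            squared_norm += grad * grad
        saliency.append(math.sqrt(squared_norm))
    return saliency


# Hypothetical 4-frame clip with 2-dimensional frame features.
frames = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5], [0.1, 0.2]]
w_att = [1.0, 1.0]   # hypothetical attention weights
w_cls = [1.0, -0.5]  # hypothetical class readout weights

sal = frame_saliency(frames, w_att, w_cls)
top_frame = max(range(len(sal)), key=sal.__getitem__)
print("saliency per frame:", sal)
print("most salient frame index:", top_frame)
```

In this toy clip the third frame (index 2) has by far the largest features, so it receives most of the attention weight and the largest gradient norm. Ranking frames by such a score is one simple way to turn a saliency map into a list of candidate important frames.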