LG, SD, AS, ML | Feb 19, 2020

Gradient-Adjusted Neuron Activation Profiles for Comprehensive Introspection of Convolutional Speech Recognition Models

arXiv:2002.08125v1
AI Analysis

This work addresses the challenge of interpreting complex speech recognition models, a concern for both researchers and practitioners, though it appears incremental in that it adapts introspection methods from computer vision to speech.

The authors tackle the problem of interpreting deep learning-based automatic speech recognition (ASR) models by introducing Gradient-adjusted Neuron Activation Profiles (GradNAPs), which are used to visualize and analyze features and representations in neural networks. They demonstrate the techniques on a fully-convolutional ASR model.

Deep learning-based Automatic Speech Recognition (ASR) models are very successful, but hard to interpret. To gain a better understanding of how Artificial Neural Networks (ANNs) accomplish their tasks, introspection methods have been proposed. Adapting such techniques from computer vision to speech recognition is not straightforward, because speech data is more complex and less interpretable than image data. In this work, we introduce Gradient-adjusted Neuron Activation Profiles (GradNAPs) as a means to interpret features and representations in Deep Neural Networks. GradNAPs are characteristic responses of ANNs to particular groups of inputs, which incorporate the relevance of neurons for prediction. We show how to utilize GradNAPs to gain insight into how data is processed in ANNs. This includes different ways of visualizing features and clustering of GradNAPs to compare embeddings of different groups of inputs in any layer of a given network. We demonstrate our proposed techniques using a fully-convolutional ASR model.
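
The abstract describes GradNAPs only at a high level, as group-wise characteristic responses that incorporate neuron relevance. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that relevance is modelled by multiplying each activation by the gradient of the prediction with respect to that activation, that a profile is the per-group mean of these weighted activations, and that profiles are centred on the mean over all groups. The function name `grad_adjusted_profiles` and the toy data are hypothetical.

```python
def grad_adjusted_profiles(activations, gradients, group_ids):
    """Return {group: profile} for one layer.

    Assumptions (not taken from the abstract):
    - relevance weighting is the elementwise product activation * gradient,
    - a profile is the mean of the weighted activations over a group,
    - profiles are centred by subtracting the mean profile over all groups.
    """
    if not (len(activations) == len(gradients) == len(group_ids)):
        raise ValueError("need one activation and gradient vector per example")
    if not activations:
        raise ValueError("need at least one example")

    sums, counts = {}, {}
    for act, grad, group in zip(activations, gradients, group_ids):
        if len(act) != len(grad):
            raise ValueError("activation and gradient sizes differ")
        weighted = [a * g for a, g in zip(act, grad)]
        if group in sums:
            sums[group] = [s + w for s, w in zip(sums[group], weighted)]
            counts[group] += 1
        else:
            sums[group] = weighted
            counts[group] = 1

    # Per-group mean of the gradient-weighted activations.
    means = {g: [s / counts[g] for s in total] for g, total in sums.items()}

    # Centre on the mean over groups so that only group-specific
    # deviations remain in each profile.
    n_neurons = len(next(iter(means.values())))
    baseline = [
        sum(m[j] for m in means.values()) / len(means) for j in range(n_neurons)
    ]
    return {g: [v - b for v, b in zip(m, baseline)] for g, m in means.items()}


# Toy data: three examples, two neurons, two input groups.
activations = [[1.0, 2.0], [3.0, 4.0], [0.0, 2.0]]
gradients = [[1.0, 0.5], [1.0, 0.5], [2.0, 1.0]]
group_ids = ["a", "a", "b"]
profiles = grad_adjusted_profiles(activations, gradients, group_ids)
```

Under these assumptions, the resulting profiles could be compared across groups, for example by clustering them per layer, which is the kind of analysis the abstract mentions.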
