Response Characterization for Auditing Cell Dynamics in Long Short-term Memory Networks
This work addresses the interpretability of deep learning models for researchers and practitioners, though the contribution appears incremental because it builds on existing response characterization methods.
The paper tackles the problem of interpreting recurrent neural networks, specifically LSTMs, at the cellular level. It introduces a method that analyzes hidden state dynamics using interpretable metrics, which enables identification of key neurons and quantification of their impact on test accuracy through ablation analysis.
In this paper, we introduce a novel method to interpret recurrent neural networks (RNNs), particularly long short-term memory networks (LSTMs), at the cellular level. We propose a systematic pipeline for interpreting individual hidden state dynamics within the network using response characterization methods. The ranked contribution of individual cells to the network's output is computed by analyzing a set of interpretable metrics of their decoupled step and sinusoidal responses. As a result, our method is able to uniquely identify neurons with insightful dynamics, quantify relationships between dynamical properties and test accuracy through ablation analysis, and interpret the impact of network capacity on a network's dynamical distribution. Finally, we demonstrate the generalizability and scalability of our method by evaluating it on a series of benchmark sequential datasets.
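As a rough illustration of the response characterization idea described above, the following sketch drives a single isolated LSTM unit with a step input and a sinusoidal input and computes a few summary metrics of its hidden state. The scalar weights, the 5% settling band, and the chosen metrics (final value, settling time, amplitude, dominant frequency) are assumptions made for illustration only; the abstract does not specify the metrics, and this is not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_cell_response(u, w, r, b):
    """Hidden-state trajectory of one isolated LSTM unit for a scalar input u.

    w, r, b each hold four scalars ordered as (input gate, forget gate,
    cell candidate, output gate): input weights, recurrent weights, biases.
    All values passed in below are hypothetical.
    """
    h, c = 0.0, 0.0
    hs = np.empty(len(u))
    for t, x in enumerate(u):
        i = sigmoid(w[0] * x + r[0] * h + b[0])
        f = sigmoid(w[1] * x + r[1] * h + b[1])
        g = np.tanh(w[2] * x + r[2] * h + b[2])
        o = sigmoid(w[3] * x + r[3] * h + b[3])
        c = f * c + i * g
        h = o * np.tanh(c)
        hs[t] = h
    return hs


def step_metrics(y, tol=0.05):
    """Final value and settling time (assumed definition: first step after
    which the response stays within tol * |final| of the final value)."""
    final = float(y[-1])
    band = tol * max(abs(final), 1e-12)
    outside = np.nonzero(np.abs(y - final) > band)[0]
    settling = 0 if outside.size == 0 else int(outside[-1]) + 1
    return {"final": final, "settling_time": settling}


def sine_metrics(y, discard):
    """Amplitude and dominant frequency after discarding the initial transient."""
    tail = y[discard:]
    amplitude = 0.5 * float(tail.max() - tail.min())
    spectrum = np.abs(np.fft.rfft(tail - tail.mean()))
    freqs = np.fft.rfftfreq(tail.size, d=1.0)
    return {"amplitude": amplitude,
            "dominant_freq": float(freqs[np.argmax(spectrum)])}


# Hypothetical parameters for one cell (not taken from any trained network).
w = [1.0, 1.0, 1.0, 1.0]
r = [0.5, 0.5, 0.5, 0.5]
b = [0.0, 1.0, 0.0, 0.0]

# Step response: unit step held for 200 time steps.
step_y = lstm_cell_response(np.ones(200), w, r, b)
step_m = step_metrics(step_y)

# Sinusoidal response: 0.05 cycles per step, first 200 steps discarded.
t = np.arange(400)
sine_y = lstm_cell_response(np.sin(2 * np.pi * 0.05 * t), w, r, b)
sine_m = sine_metrics(sine_y, discard=200)

print(step_m, sine_m)
```

Computing such metrics for every cell in a trained network would give one feature vector per cell, which could then be sorted to rank cells or used to choose which cells to ablate.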