Explaining high-dimensional text classifiers
This work addresses the limitations of classic explainability tools for high-dimensional inputs and neural network classifiers, an incremental improvement for users who need interpretability in text and security domains.
The paper introduces a new explainability method for high-dimensional text classifiers, based on theoretically proven high-dimensional properties of neural networks, and applies it to sentiment analysis on the IMDB dataset and malware detection on a PowerShell scripts dataset.
Explainability has become a valuable tool in the last few years, helping humans better understand AI-guided decisions. However, classic explainability tools are sometimes quite limited when applied to high-dimensional inputs and neural network classifiers. We present a new explainability method that uses theoretically proven high-dimensional properties of neural network classifiers. We demonstrate it on two tasks: 1) the classical sentiment analysis task on the IMDB reviews dataset, and 2) our malware detection task on our PowerShell scripts dataset.