Interpretable Text Classification Using CNN and Max-pooling
This work addresses interpretability in text classification for users needing transparent AI decisions, but it is incremental as it builds on existing CNN methods.
The authors tackled the problem of interpreting deep neural networks for text classification by proposing convolution attribution and n-gram feature analysis for a CNN model, achieving interpretability without performance loss in multi-sentence scenarios.
Deep neural networks have been widely used in text classification. However, these models are hard to interpret because of their complicated mechanisms. In this work, we study the interpretability of a variant of the typical text classification model based on convolution operations and a max-pooling layer. We propose two mechanisms, convolution attribution and n-gram feature analysis, to analyse how the CNN model processes its input. The interpretability of the model is reflected in the posterior interpretations it provides for neural network predictions. In addition, we propose a multi-sentence strategy that enables the model to be used in multi-sentence settings without loss of performance or interpretability. We evaluate the performance of the model on several classification tasks and support its interpretability with case studies.
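To illustrate why a convolution plus max-pooling model lends itself to n-gram level interpretation, the following is a minimal sketch and not the method proposed in this work. It assumes a generic setup: each filter spans n consecutive tokens, max-pooling keeps one activation per filter, and a linear output layer weights the pooled values. Under those assumptions, every pooled feature can be traced back to the single n-gram that produced it. All names (`conv_max_pool`, `explain`), the toy embeddings, the filters, and the class weights are hypothetical and chosen by hand.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Filter = List[Vector]  # one weight vector per token position in the n-gram


def conv_max_pool(tokens: List[str], embeddings: Dict[str, Vector],
                  filt: Filter, bias: float = 0.0) -> Tuple[float, int]:
    """Slide one filter over every n-gram and max-pool the activations.

    Returns the largest ReLU activation and the start index of the
    n-gram that produced it (-1 if the text is shorter than the filter).
    """
    n = len(filt)  # filter width equals the n-gram size
    best_val, best_pos = float("-inf"), -1
    for start in range(len(tokens) - n + 1):
        act = bias
        for k in range(n):
            vec = embeddings[tokens[start + k]]
            act += sum(w * x for w, x in zip(filt[k], vec))
        act = max(act, 0.0)  # ReLU
        if act > best_val:   # strict '>' keeps the first n-gram on ties
            best_val, best_pos = act, start
    return best_val, best_pos


def explain(tokens: List[str], embeddings: Dict[str, Vector],
            filters: List[Filter],
            class_weights: List[float]) -> List[Tuple[str, float]]:
    """For one class, list each filter's winning n-gram and its contribution
    (output-layer weight times pooled activation) to the class score."""
    report = []
    for filt, weight in zip(filters, class_weights):
        value, pos = conv_max_pool(tokens, embeddings, filt)
        ngram = " ".join(tokens[pos:pos + len(filt)])
        report.append((ngram, weight * value))
    return report


# Toy 2-dimensional embeddings (assumed values, for illustration only).
embeddings = {
    "the": [0, 0], "movie": [0, 0], "is": [0, 0],
    "not": [0, 1], "good": [1, 0],
}
tokens = ["the", "movie", "is", "not", "good"]

filters = [
    [[0, 1], [1, 0]],  # width-2 filter that responds to "not good"
    [[1, 0]],          # width-1 filter that responds to "good"
]
negative_class_weights = [1.5, -0.5]  # hypothetical output-layer weights

result = explain(tokens, embeddings, filters, negative_class_weights)
for ngram, contribution in result:
    print(f"{ngram!r}: {contribution:+.2f}")
```

In this toy case the bigram "not good" pushes the score for the negative class up, while the unigram "good" pushes it down. Summing the contributions gives the class score before any bias or softmax, which is what makes a per-n-gram reading of a prediction possible in this simplified setting.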