CL, Sep 21, 2018

Understanding Convolutional Neural Networks for Text Classification

Alon Jacovi, Oren Sar Shalom, Yoav Goldberg

arXiv:1809.08037v332.81152 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This paper addresses the interpretability problem for CNNs in NLP and offers insights for both researchers and practitioners. The contribution is incremental, since it builds on existing CNN methods.

The paper analyzes how Convolutional Neural Networks (CNNs) process and classify text, showing that filters act as ngram detectors with different semantic classes and that global max-pooling separates important ngrams, leading to applications in model and prediction interpretability.

We present an analysis into the inner workings of Convolutional Neural Networks (CNNs) for processing text. CNNs used for computer vision can be interpreted by projecting filters into image space, but for discrete sequence inputs CNNs remain a mystery. We aim to understand the method by which the networks process and classify text. We examine common hypotheses to this problem: that filters, accompanied by global max-pooling, serve as ngram detectors. We show that filters may capture several different semantic classes of ngrams by using different activation patterns, and that global max-pooling induces behavior which separates important ngrams from the rest. Finally, we show practical use cases derived from our findings in the form of model interpretability (explaining a trained model by deriving a concrete identity for each filter, bridging the gap between visualization tools in vision tasks and NLP) and prediction interpretability (explaining predictions). Code implementation is available online at github.com/sayaendo/interpreting-cnn-for-text.
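The ngram-detector hypothesis from the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-dimensional toy embeddings, and the hand-set filter weights below are all assumptions chosen for illustration, and the nonlinearity and classifier layer of a real model are omitted. Each filter scores every ngram in the sentence with a dot product, and global max-pooling keeps only the highest score, so the ngram that produced it is the one the filter "detected".

```python
def ngram_scores(embeddings, filt, bias=0.0):
    """Score each ngram as the dot product of the filter with its word vectors."""
    n = len(filt)  # filter width, in words
    if len(embeddings) < n:
        raise ValueError("sentence is shorter than the filter width")
    scores = []
    for start in range(len(embeddings) - n + 1):
        window = embeddings[start:start + n]
        score = bias + sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        scores.append(score)
    return scores


def max_pooled_ngram(tokens, embeddings, filt):
    """Global max-pooling: return the top score and the ngram that produced it."""
    scores = ngram_scores(embeddings, filt)
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], tokens[best:best + len(filt)]


# Hypothetical toy data: dimension 0 ~ "negation", dimension 1 ~ "positivity".
tokens = ["the", "movie", "was", "not", "good"]
embeddings = [
    [0.0, 0.0],  # the
    [0.0, 0.1],  # movie
    [0.0, 0.0],  # was
    [1.0, 0.0],  # not
    [0.0, 1.0],  # good
]

# A hand-set bigram filter: negation in slot 1, positivity in slot 2.
filt = [
    [1.0, 0.0],
    [0.0, 1.0],
]

score, ngram = max_pooled_ngram(tokens, embeddings, filt)
print(score, ngram)
```

In this toy setup the filter responds most strongly to the bigram "not good", and every other bigram in the sentence is discarded by the pooling step. Tracing the pooled value back to its source ngram in this way is the basic idea behind explaining a prediction, under the assumptions stated above.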
