Distilling neural networks into skipgram-level decision lists
This work addresses the need for scalable and parameter-sensitive explanations in NLP, particularly for clinical and sentiment analysis tasks, though it is incremental in building on prior rule-based methods.
The authors tackle the problem of explaining recurrent neural networks by proposing a pipeline that distills them into decision lists over skipgrams, achieving high explanation fidelity and interpretable rules on synthetic and real-world datasets.
Several previous studies on explaining recurrent neural networks (RNNs) focus on finding the input segments that are most important to a network and presenting those segments as its explanation. In that case, the manner in which these input segments combine with each other to form an explanatory pattern remains unknown. To overcome this, some previous work tries to find patterns (called rules) in the data that explain neural outputs. However, these explanations are often insensitive to model parameters, which limits the scalability of text explanations. To overcome these limitations, we propose a pipeline to explain RNNs by means of decision lists (also called rules) over skipgrams. To evaluate the explanations, we create a synthetic sepsis-identification dataset and also apply our technique to additional clinical and sentiment analysis datasets. We find that our technique consistently achieves high explanation fidelity and qualitatively interpretable rules.
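To make the two central notions concrete, the following is a minimal sketch of a decision list over skipgrams and of explanation fidelity, measured here as the fraction of inputs on which the decision list agrees with the network. It is not the authors' pipeline: the names `matches`, `predict`, `fidelity`, and `toy_model`, the `max_skip` window, the hand-written rules, and the example sentences are all assumptions made for illustration, and `toy_model` is a stand-in function rather than a trained RNN.

```python
from typing import Callable, List, Sequence, Tuple

Skipgram = Tuple[str, ...]
Rule = Tuple[Skipgram, int]  # (pattern, label predicted when the pattern matches)


def matches(skipgram: Skipgram, tokens: Sequence[str], max_skip: int = 2) -> bool:
    """True if the skipgram terms occur in order in `tokens`, with at most
    `max_skip` tokens skipped between consecutive terms (assumed definition)."""
    def search(start: int, idx: int) -> bool:
        if idx == len(skipgram):
            return True
        # The first term may start anywhere; later terms must fall in the window.
        end = len(tokens) if idx == 0 else min(len(tokens), start + max_skip + 1)
        for pos in range(start, end):
            if tokens[pos] == skipgram[idx] and search(pos + 1, idx + 1):
                return True
        return False
    return search(0, 0)


def predict(rules: List[Rule], default: int, tokens: Sequence[str],
            max_skip: int = 2) -> int:
    """A decision list is ordered: the first matching rule decides the label."""
    for skipgram, label in rules:
        if matches(skipgram, tokens, max_skip):
            return label
    return default


def fidelity(rules: List[Rule], default: int,
             model_predict: Callable[[Sequence[str]], int],
             texts: List[Sequence[str]], max_skip: int = 2) -> float:
    """Fraction of texts where the decision list agrees with the model.
    Assumes `texts` is non-empty."""
    agree = sum(predict(rules, default, t, max_skip) == model_predict(t)
                for t in texts)
    return agree / len(texts)


# Hypothetical hand-written rules; a real pipeline would learn them.
rules: List[Rule] = [(("not", "good"), 0), (("good",), 1)]


def toy_model(tokens: Sequence[str]) -> int:
    """Stand-in for the network being explained (NOT a real RNN)."""
    return int("good" in tokens and "not" not in tokens)


texts = [s.split() for s in [
    "the food was not very good",
    "the food was good",
    "the food was bland",
    "not sure but overall it was good",
]]
score = fidelity(rules, default=0, model_predict=toy_model, texts=texts)
```

In this toy setup the fourth sentence is the only disagreement: "not" and "good" are more than `max_skip` tokens apart, so the first rule does not fire and the list falls through to the second rule, while the stand-in model still predicts the negative class. Rule order matters for the same reason: placing `("good",)` first would shadow the more specific `("not", "good")` rule.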