Learning Explanations from Language Data
arXiv:1808.04127v11095 citations
Originality Synthesis-oriented
AI Analysis
This work provides explanations for deep learning models in language processing, but it is incremental as it applies an existing method to a new domain.
The paper tackles the problem of explaining deep neural network classifications in the language domain by applying PatternAttribution, a method originally from vision, and finds that it generates meaningful interpretations.
PatternAttribution is a recent method, introduced in the vision domain, that explains classifications of deep neural networks. We demonstrate that it also generates meaningful interpretations in the language domain.