Do Human Rationales Improve Machine Explanations?
This work connects learning with rationales to explainable AI and addresses the need for better machine explanations in text classification. The contribution is incremental, since it builds on existing attention methods.
The paper asks whether human-provided rationales can improve machine-generated explanations. It shows that, in CNN-based text classification, supervised attention yields explanations that human evaluators judge superior to those produced by unsupervised attention.
Work on "learning with rationales" shows that humans providing explanations to a machine learning system can improve the system's predictive accuracy. However, this work has not been connected to work in "explainable AI", which concerns machines explaining their reasoning to humans. In this work, we show that learning with rationales can also improve the quality of the machine's explanations as evaluated by human judges. Specifically, we present experiments showing that, for CNN-based text classification, explanations generated using "supervised attention" are judged superior to explanations generated using normal unsupervised attention.
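To make the idea of "supervised attention" concrete, the following is a minimal sketch of one common way to formulate it, not necessarily the formulation used in this work. It assumes that a human rationale is given as a binary mask over tokens, that the model produces a softmax attention distribution over the same tokens, and that the training objective adds a cross-entropy penalty between the two to the usual classification loss. The function names, the weight `lam`, and the toy inputs are all hypothetical and chosen for illustration only.

```python
import math

def softmax(scores):
    """Turn raw attention scores into a probability distribution over tokens."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_loss(attn, rationale_mask, eps=1e-12):
    """Cross-entropy between the normalized human rationale mask and the attention weights.

    Assumption: tokens marked 1 are the ones the annotator highlighted.
    Returns 0.0 when no rationale is available, so the example falls back
    to ordinary (unsupervised) attention for that document.
    """
    n_marked = sum(rationale_mask)
    if n_marked == 0:
        return 0.0
    target = [r / n_marked for r in rationale_mask]
    return -sum(t * math.log(a + eps) for t, a in zip(target, attn))

def combined_loss(class_prob_true, attn, rationale_mask, lam=1.0):
    """Classification loss plus a weighted attention-supervision term (lam is hypothetical)."""
    classification = -math.log(class_prob_true)
    return classification + lam * attention_loss(attn, rationale_mask)

# Toy example: the annotator highlighted only "terrible" as the rationale.
tokens = ["the", "plot", "was", "terrible"]
rationale_mask = [0, 0, 0, 1]

aligned = softmax([0.0, 0.0, 0.0, 3.0])     # attention focused on "terrible"
misaligned = softmax([3.0, 0.0, 0.0, 0.0])  # attention focused on "the"

loss_aligned = combined_loss(0.9, aligned, rationale_mask)
loss_misaligned = combined_loss(0.9, misaligned, rationale_mask)
print(loss_aligned < loss_misaligned)
```

With the same classification confidence in both cases, the attention that agrees with the human rationale incurs the lower total loss. Under this assumed objective, training pushes the attention weights, and hence any explanation derived from them, toward the words that humans consider relevant.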