Extending Logic Explained Networks to Text Classification
This work addresses the need for better local explainability in text classification, targeting users who require interpretable AI models. Its contribution is incremental: it extends an existing method to a new domain with specific improvements.
The paper tackles the problem of generating local explanations for text classification using Logic Explained Networks (LENs). It proposes LENp, which improves local explanations by perturbing input words. The results show that LENp outperforms LIME in sensitivity and faithfulness, and a human survey finds logic explanations more useful and user-friendly than LIME's feature scoring.
Recently, Logic Explained Networks (LENs) have been proposed as explainable-by-design neural models providing logic explanations for their predictions. However, these models have only been applied to vision and tabular data, and they mostly favour the generation of global explanations, while local ones tend to be noisy and verbose. For these reasons, we propose LENp, improving local explanations by perturbing input words, and we test it on text classification. Our results show that (i) LENp provides better local explanations than LIME in terms of sensitivity and faithfulness, and (ii) logic explanations are more useful and user-friendly than the feature scoring provided by LIME, as attested by a human survey.
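To make the idea of perturbing input words concrete, the sketch below shows a generic leave-one-out perturbation that turns a single prediction into a short conjunctive rule. It is not the LENp algorithm, whose details are not given here. The names `toy_classifier` and `leave_one_out_explanation` are hypothetical, and the classifier is a hand-written stand-in for a trained model.

```python
from typing import Callable, List, Tuple


def toy_classifier(words: List[str]) -> int:
    """Hypothetical stand-in for a trained text classifier (assumption).

    Predicts class 1 (positive) if "excellent" is present and "boring" is not.
    """
    return int("excellent" in words and "boring" not in words)


def leave_one_out_explanation(
    words: List[str], classify: Callable[[List[str]], int]
) -> Tuple[int, List[str]]:
    """Return the original prediction and the words whose removal flips it.

    This is a plain leave-one-out perturbation, used only for illustration;
    it is not claimed to match how LENp perturbs inputs.
    """
    original = classify(words)
    relevant = []
    for i, word in enumerate(words):
        # Perturb the input by deleting exactly one word.
        perturbed = words[:i] + words[i + 1:]
        if classify(perturbed) != original:
            relevant.append(word)
    return original, relevant


words = "the plot was excellent and the cast shines".split()
label, relevant = leave_one_out_explanation(words, toy_classifier)

# Render the surviving words as a compact conjunctive rule.
rule = " AND ".join(relevant) + f" -> class {label}"
print(rule)
```

Keeping only the words whose removal changes the prediction is one simple way to make a local explanation less noisy and verbose, which is the weakness of local LEN explanations noted above.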