Pruning Literals for Highly Efficient Explainability at Word Level
This work addresses the need of users and practitioners for more interpretable NLP models, though the contribution is incremental because it builds on the existing Tsetlin Machine framework.
The paper tackles the limited explainability of NLP models by proposing a post-hoc pruning method for Tsetlin Machines that eliminates randomly placed literals from clauses, making the clauses more interpretable at the word level. The results show that the pruned model's attention map aligns better with human attention maps than the vanilla model's does, and that accuracy improves by as much as 4% to 9% on some test data.
Designing explainable models has become crucial for Natural Language Processing (NLP), since most state-of-the-art machine learning models provide only a limited explanation for their predictions. Among explainable models, the Tsetlin Machine (TM) is promising because it can provide word-level explanations using propositional logic. However, despite its transparent learning process, the elaborate combinations of literals in its clauses make the model difficult for humans to comprehend. In this paper, we design a post-hoc pruning of clauses that eliminates the randomly placed literals in each clause, thereby making the model more efficiently interpretable than the vanilla TM. Experiments on the publicly available YELP-HAT dataset demonstrate that the pruned TM's attention map aligns more closely with the human attention map than the vanilla TM's attention map does. In addition, its pairwise similarity measure surpasses that of attention map-based neural network models. In terms of accuracy, the proposed pruning method does not degrade performance significantly; rather, it improves accuracy by as much as 4% to 9% on some test data.
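To make the idea of post-hoc literal pruning concrete, the following is a minimal sketch rather than the paper's actual procedure. The abstract does not state the pruning criterion, so the sketch assumes a simple frequency rule: a literal that appears in fewer than `min_count` clauses is treated as randomly placed and removed. The function name `prune_clauses`, the parameter `min_count`, and the toy clauses are all hypothetical.

```python
from collections import Counter


def prune_clauses(clauses, min_count=2):
    """Remove literals that occur in fewer than `min_count` clauses.

    ASSUMPTION: frequency across clauses is used as a stand-in for
    "randomly placed"; the source does not specify the real criterion.
    Each clause is a list of literal strings, e.g. "good" or "NOT bad".
    """
    # Count in how many clauses each literal appears (once per clause).
    counts = Counter(lit for clause in clauses for lit in set(clause))
    # Keep only the literals that are frequent enough, preserving order.
    return [[lit for lit in clause if counts[lit] >= min_count]
            for clause in clauses]


# Toy clauses (hypothetical): conjunctions of word-level literals.
clauses = [
    ["good", "NOT bad", "table"],
    ["good", "tasty", "NOT bad"],
    ["tasty", "good", "monday"],
]

pruned = prune_clauses(clauses, min_count=2)
for before, after in zip(clauses, pruned):
    print(" AND ".join(before), "->", " AND ".join(after))
```

In this toy run, `table` and `monday` each occur in only one clause and are dropped, leaving shorter clauses built from the sentiment-bearing words. A real pipeline would extract the clauses from a trained TM and would need to check accuracy after pruning.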