PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure
This work addresses the need for interpretability in NLP text classification, offering researchers and practitioners a tool for understanding model decisions and debugging datasets. The contribution is incremental, since it builds on existing tree-based and embedding methods.
The authors tackle the problem of explaining text classification decisions by proposing PEACH, a tree-based method that uses pretrained contextual embeddings to generate human-interpretable explanations. They demonstrate its utility for analyzing embeddings and debugging datasets, while achieving performance comparable to or better than that of pretrained models.
In this work, we propose PEACH (Pretrained-embedding Explanation Across Contextual and Hierarchical Structure), a novel tree-based explanation technique that explains, in a human-interpretable manner, how text documents are classified using pretrained contextual embeddings. PEACH can take the contextual embeddings of any pretrained language model (PLM) as training input for a decision tree. Using PEACH, we perform a comprehensive analysis of several contextual embeddings on nine NLP text classification benchmarks. This analysis demonstrates the flexibility of the model across several PLM contextual embeddings and across its attribute selection, scaling, and clustering methods. Furthermore, we show the utility of the explanations by visualising the selected features and the important trends in text classification as human-interpretable word-cloud-based trees, which clearly identify model mistakes and assist in dataset debugging. Besides interpretability, PEACH achieves classification performance comparable to or better than that of the pretrained models.
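The core idea of fitting a decision tree on document embeddings can be illustrated with a minimal sketch. This is not the authors' implementation. The three-dimensional vectors below are made-up stand-ins for PLM document embeddings, the labels are hypothetical, and the small Gini-based tree is a generic stand-in for whichever tree learner is actually used. The attribute selection, scaling, clustering, and word-cloud visualisation steps mentioned in the abstract are not reproduced here.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) pair that lowers weighted Gini the most,
    or None if no split improves on the unsplit node."""
    best, best_score = None, gini(y)
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best

def build_tree(X, y, depth=0, max_depth=3):
    """Recursively grow a tree; leaves store the majority label."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return {"label": Counter(y).most_common(1)[0][0]}
    j, t = split
    left = [i for i in range(len(y)) if X[i][j] <= t]
    right = [i for i in range(len(y)) if X[i][j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           depth + 1, max_depth),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            depth + 1, max_depth),
    }

def predict(tree, x):
    """Return (label, path), where path lists the tests applied to x."""
    path = []
    while "label" not in tree:
        j, t = tree["feature"], tree["threshold"]
        go_left = x[j] <= t
        path.append((j, t, "<=" if go_left else ">"))
        tree = tree["left"] if go_left else tree["right"]
    return tree["label"], path

# Hypothetical embeddings: one made-up vector per document (assumption).
X = [
    [0.9, 0.1, 0.3], [0.8, 0.2, 0.5], [0.7, 0.3, 0.1],  # "pos" documents
    [0.2, 0.9, 0.4], [0.1, 0.8, 0.6], [0.3, 0.7, 0.2],  # "neg" documents
]
y = ["pos", "pos", "pos", "neg", "neg", "neg"]

tree = build_tree(X, y)
label, path = predict(tree, [0.85, 0.15, 0.4])
print(label, path)
```

The path returned by `predict` is the list of embedding-dimension tests that led to the decision, which is the kind of structure a tree-based explanation can then present in human-readable form.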