CL · LG · May 10, 2024

Explaining Text Similarity in Transformer Models

arXiv:2405.06604v1 · 33 citations · h-index: 1 · NAACL
Originality: Incremental advance
AI Analysis

This work addresses the need for explainability in unsupervised NLP applications such as information retrieval. The advance is incremental, since it builds on existing explainable AI methods.

The paper tackles the problem of understanding and explaining predictions in Transformer-based similarity models for NLP. It applies BiLRP to investigate which feature interactions drive similarity, and validates the resulting explanations in three corpus-level use cases: grammatical interactions, multilingual semantics, and biomedical text retrieval.

As Transformers have become state-of-the-art models for natural language processing (NLP) tasks, the need to understand and explain their predictions is increasingly apparent. Especially in unsupervised applications, such as information retrieval tasks, similarity models built on top of foundation model representations have been widely applied. However, their inner prediction mechanisms have mostly remained opaque. Recent advances in explainable AI have made it possible to mitigate these limitations by leveraging improved explanations for Transformers through layer-wise relevance propagation (LRP). Using BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, we investigate which feature interactions drive similarity in NLP models. We validate the resulting explanations and demonstrate their utility in three corpus-level use cases, analyzing grammatical interactions, multilingual semantics, and biomedical text retrieval. Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
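To make the idea of a second-order explanation concrete, here is a minimal sketch for the simplest possible case: a similarity score given by the dot product of two mean-pooled, linearly projected sentence representations. It is not the paper's implementation. The function name `token_interactions`, the inputs `X`, `X_prime`, and the projection `W` are all hypothetical, and the linear feature map is a simplifying assumption; a real Transformer would require LRP rules to propagate relevance through its nonlinear layers. In this linear setting, the relevance of a token pair is the product of each token's contribution to every embedding dimension, summed over dimensions, and the pairwise scores add up exactly to the similarity.

```python
import numpy as np

def token_interactions(X, X_prime, W):
    """Token-pair relevance for y = <W mean(X), W mean(X_prime)>.

    X, X_prime: (n_tokens, d) token feature matrices (hypothetical inputs).
    W: (m, d) linear projection standing in for the feature extractor.
    Returns an (n_tokens, n_tokens_prime) matrix whose entries sum to y.
    """
    # Contribution of each token to each embedding dimension, per branch.
    R_left = (X @ W.T) / X.shape[0]                # shape (n, m)
    R_right = (X_prime @ W.T) / X_prime.shape[0]   # shape (n', m)
    # Combine both branches by summing over the embedding dimensions.
    return R_left @ R_right.T

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))        # 4 tokens, 8 features (toy data)
X_prime = rng.normal(size=(5, 8))  # 5 tokens, 8 features (toy data)
W = rng.normal(size=(6, 8))        # projection to a 6-dim embedding

R = token_interactions(X, X_prime, W)
similarity = (W @ X.mean(axis=0)) @ (W @ X_prime.mean(axis=0))
```

Each entry `R[t, t_prime]` can be read as how much the pairing of token `t` in the first text with token `t_prime` in the second contributes to the score. Because the entries sum to `similarity`, the decomposition is conservative in this toy setting.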

Code Implementations: 1 repo
