CL, Oct 18, 2023

Rather a Nurse than a Physician -- Contrastive Explanations under Investigation

arXiv:2310.11906v1, 14 citations, h-index: 11
Originality: Synthesis-oriented
AI Analysis

This work addresses a foundational question in explainable AI: it shows that the assumed superiority of contrastive explanations is not empirically supported. The contribution is incremental, but it clarifies a key claim in the field.

The study investigated whether contrastive explanations align better with human reasoning than non-contrastive ones, finding no significant difference in agreement between model-based and human rationales across multiple datasets and models.

Contrastive explanations, where one decision is explained in contrast to another, are supposed to be closer to how humans explain a decision than non-contrastive explanations, where the decision is not necessarily referenced to an alternative. This claim has never been empirically validated. We analyze four English text-classification datasets (SST2, DynaSent, BIOS and DBpedia-Animals). We fine-tune and extract explanations from three different models (RoBERTa, GPT-2, and T5), each in three different sizes, and apply three post-hoc explainability methods (LRP, Gradient×Input, GradNorm). We furthermore collect and release human rationale annotations for a subset of 100 samples from the BIOS dataset for contrastive and non-contrastive settings. A cross-comparison between model-based rationales and human annotations, both in contrastive and non-contrastive settings, yields a high agreement between the two settings for models as well as for humans. Moreover, model-based explanations computed in both settings align equally well with human rationales. Thus, we empirically find that humans do not necessarily explain in a contrastive manner.

9 pages, long paper at ACL 2022 proceedings.
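To make the two settings concrete, the following is a minimal sketch of Gradient×Input on a toy linear classifier, not the authors' implementation. It assumes that the non-contrastive setting explains the logit of the target class, while the contrastive setting explains the logit difference between the target and a foil class. The weights, the class labels, and the helper names (`gradient_x_input`, `top_k`, `overlap`) are all hypothetical and chosen only for illustration.

```python
# Toy linear classifier: logit_k(x) = sum_i W[k][i] * x[i].
# All weights and inputs below are made up for illustration.


def gradient_x_input(weights, x, target, foil=None):
    """Gradient x Input relevance per input feature for a linear model.

    Non-contrastive (foil is None): explain the target logit.
    Contrastive (assumption): explain the difference target logit - foil logit.
    """
    # For a linear model, the gradient of a logit w.r.t. x is its weight row.
    grad = list(weights[target])
    if foil is not None:
        grad = [g - f for g, f in zip(grad, weights[foil])]
    return [g * xi for g, xi in zip(grad, x)]


def top_k(scores, k):
    """Indices of the k highest-scoring features (a simple rationale)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def overlap(a, b):
    """Jaccard overlap between two sets of selected token indices."""
    return len(a & b) / len(a | b)


# Hypothetical weights for 3 classes over 5 token features.
W = [
    [2.0, 0.5, 0.0, 1.0, 0.0],   # class 0, e.g. "nurse"
    [2.0, 0.4, 0.0, -1.0, 0.2],  # class 1, e.g. "physician" (the foil)
    [0.0, 0.1, 1.5, 0.0, 0.3],   # class 2
]
x = [1.0, 1.0, 1.0, 1.0, 1.0]

non_contrastive = gradient_x_input(W, x, target=0)
contrastive = gradient_x_input(W, x, target=0, foil=1)

rationale_nc = top_k(non_contrastive, k=2)
rationale_c = top_k(contrastive, k=2)
agreement = overlap(rationale_nc, rationale_c)
```

In this toy case, feature 0 is highly relevant for the target class on its own but drops out of the contrastive rationale because the foil class weights it equally. The same `overlap` helper could in principle be applied between a model rationale and a set of human-annotated token indices, although the agreement measures actually used in the paper may differ.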

