CL | Aug 4, 2024

Optimal and efficient text counterfactuals using Graph Neural Networks

arXiv:2408.01969v2 | 23 citations | h-index: 29
AI Analysis

This work addresses interpretability for users of NLP models in decision-making, though it appears incremental as it builds on existing counterfactual editing approaches.

The authors tackled the need for explainability in NLP models by proposing a framework for generating counterfactual interventions that change model predictions, achieving faster processing than state-of-the-art methods on sentiment and topic classification tasks.

As NLP models become increasingly integral to decision-making processes, the need for explainability and interpretability has become paramount. In this work, we propose a framework that addresses this need by generating semantically edited inputs, known as counterfactual interventions, which change the model prediction, thus providing a form of counterfactual explanation for the model. We test our framework on two NLP tasks (binary sentiment classification and topic classification) and show that the generated edits are contrastive, fluent and minimal, while the whole process remains significantly faster than other state-of-the-art counterfactual editors.
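To make the idea of a counterfactual intervention concrete, here is a minimal toy sketch. It is not the authors' method: the lexicon classifier, the `SUBSTITUTES` table, and the greedy left-to-right search are all hypothetical stand-ins chosen for illustration. It only shows what "an edit that changes the model prediction" means for a binary sentiment task.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical toy sentiment lexicon (an assumption, not from the paper).
POSITIVE = {"great", "good", "enjoyable"}
NEGATIVE = {"terrible", "bad", "boring"}

# Hypothetical substitution candidates: each word maps to a contrasting word.
SUBSTITUTES: Dict[str, str] = {
    "great": "terrible", "good": "bad", "enjoyable": "boring",
    "terrible": "great", "bad": "good", "boring": "enjoyable",
}


def classify(tokens: List[str]) -> str:
    """Toy stand-in for an NLP classifier: count lexicon hits."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative"


def counterfactual(
    tokens: List[str],
    predict: Callable[[List[str]], str] = classify,
) -> Optional[List[str]]:
    """Greedily substitute words, left to right, until the prediction flips.

    Returns the edited token list, or None if no flip is found.
    Greedy search does not guarantee the smallest possible edit set.
    """
    original_label = predict(tokens)
    edited = list(tokens)
    for i, tok in enumerate(tokens):
        if tok in SUBSTITUTES:
            edited[i] = SUBSTITUTES[tok]
            if predict(edited) != original_label:
                return edited
    return None


if __name__ == "__main__":
    text = "the movie was great".split()
    print(classify(text), "->", counterfactual(text))
```

In this sketch, the edited sentence serves as the explanation: the words that had to change to flip the label are the ones the toy classifier relied on. A real counterfactual editor would also need to keep the edit fluent and as small as possible, which this greedy loop does not attempt.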

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
