CMA-R: Causal Mediation Analysis for Explaining Rumour Detection
This work provides interpretability for black-box rumour detection systems. The contribution is incremental, as it applies an existing causal method to a specific domain.
The paper tackles the problem of explaining neural models for rumour detection on Twitter by applying causal mediation analysis to reveal the causal impacts of tweets and words. It finds that CMA-R identifies salient tweets that agree with human judgements and highlights impactful words for interpretability.
We apply causal mediation analysis to explain the decision-making process of neural models for rumour detection on Twitter. Interventions at the input and network level reveal the causal impacts of tweets and words on the model output. We find that our approach, CMA-R (Causal Mediation Analysis for Rumour detection), identifies salient tweets that explain model predictions and agree strongly with human judgements of the critical tweets determining the truthfulness of stories. CMA-R can further highlight causally impactful words in the salient tweets, providing another layer of interpretability and transparency into these black-box rumour detection systems. Code is available at: https://github.com/ltian678/cma-r.
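To make the idea of input-level and network-level interventions concrete, here is a minimal sketch on a toy model. It is not the CMA-R implementation. The tiny network, its weights, the mean pooling, the all-zero "null tweet" intervention, and the helper names (`total_effect`, `indirect_effect`) are all assumptions chosen for illustration. The total effect measures how the output changes when one tweet is replaced at the input. The indirect effect measures the change when the input is left untouched and a single hidden neuron is set to the value it would take under that intervention.

```python
import math

# Hypothetical toy "rumour detector" (weights are made up for illustration).
# Each tweet is a 2-d feature vector; the thread is mean-pooled, passed
# through two tanh hidden neurons, then a sigmoid output.
W_HIDDEN = [[1.5, -0.5], [-1.0, 2.0]]
W_OUT = [2.0, -1.0]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool(tweets):
    # Mean of the tweet vectors, dimension by dimension.
    n = len(tweets)
    return [sum(t[d] for t in tweets) / n for d in range(2)]


def hidden(tweets):
    x = pool(tweets)
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_HIDDEN]


def output(h):
    return sigmoid(sum(w * hi for w, hi in zip(W_OUT, h)))


def predict(tweets):
    return output(hidden(tweets))


def intervene(tweets, i):
    # Input-level intervention (assumption): replace tweet i by a null tweet.
    return [[0.0, 0.0] if j == i else t for j, t in enumerate(tweets)]


def total_effect(tweets, i):
    # Change in the model output caused by the input intervention.
    return predict(intervene(tweets, i)) - predict(tweets)


def indirect_effect(tweets, i, neuron):
    # Network-level intervention: keep the original input, but set one
    # hidden neuron to the value it takes under the intervened input.
    patched = hidden(tweets)
    patched[neuron] = hidden(intervene(tweets, i))[neuron]
    return output(patched) - predict(tweets)


thread = [[0.2, 0.1], [1.8, -0.3], [0.1, 0.4]]
effects = [total_effect(thread, i) for i in range(len(thread))]
# The tweet with the largest absolute total effect is treated as most salient.
salient = max(range(len(thread)), key=lambda i: abs(effects[i]))
```

In this toy setting the second tweet dominates the prediction, so it receives the largest total effect. The same pattern could in principle be applied at the word level by intervening on individual words within a salient tweet, although how the paper performs that step is not specified in the abstract.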