CL · LG · Apr 12, 2022

A Comparative Study of Faithfulness Metrics for Model Interpretability Methods

arXiv:2204.05514v1660 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses how to evaluate interpretability methods for machine learning models, a question that matters to AI researchers and practitioners. The advance is incremental because it builds on existing metrics.

The study tackles the problem of conflicting preferences among faithfulness metrics for model interpretability by introducing two assessment dimensions, diagnosticity and time complexity. It finds that the sufficiency and comprehensiveness metrics have higher diagnosticity and lower time complexity than the other metrics.

Interpretation methods to reveal the internal reasoning processes behind machine learning models have attracted increasing attention in recent years. To quantify the extent to which the identified interpretations truly reflect the intrinsic decision-making mechanisms, various faithfulness evaluation metrics have been proposed. However, we find that different faithfulness metrics show conflicting preferences when comparing different interpretations. Motivated by this observation, we aim to conduct a comprehensive and comparative study of the widely adopted faithfulness metrics. In particular, we introduce two assessment dimensions, namely diagnosticity and time complexity. Diagnosticity refers to the degree to which the faithfulness metric favours relatively faithful interpretations over randomly generated ones, and time complexity is measured by the average number of model forward passes. According to the experimental results, we find that sufficiency and comprehensiveness metrics have higher diagnosticity and lower time complexity than the other faithfulness metrics.
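
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions of comprehensiveness (the drop in predicted probability after removing the top-ranked tokens) and sufficiency (the drop when keeping only those tokens), and it estimates diagnosticity as the fraction of pairs in which a metric prefers the more faithful interpretation. The toy classifier, the `WEIGHTS` table, the `CountingModel` class, and the hand-picked `arbitrary` interpretation are hypothetical stand-ins, not the paper's models or experimental setup.

```python
import math

# Hypothetical toy classifier: a bag-of-words logistic model (not from the paper).
WEIGHTS = {"great": 2.0, "good": 1.0, "bad": -1.5, "awful": -2.5}


class CountingModel:
    """Toy model that counts forward passes, the time-complexity proxy."""

    def __init__(self):
        self.forward_passes = 0

    def prob_positive(self, tokens):
        self.forward_passes += 1
        score = sum(WEIGHTS.get(t, 0.0) for t in tokens)
        return 1.0 / (1.0 + math.exp(-score))


def top_k(importance, k):
    """Indices of the k tokens with the highest importance scores."""
    order = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return set(order[:k])


def comprehensiveness(model, tokens, importance, k=1):
    """p(y|x) minus p(y|x without the top-k tokens); higher is better."""
    rationale = top_k(importance, k)
    rest = [t for i, t in enumerate(tokens) if i not in rationale]
    return model.prob_positive(tokens) - model.prob_positive(rest)


def sufficiency(model, tokens, importance, k=1):
    """p(y|x) minus p(y|top-k tokens only); lower is better."""
    rationale = top_k(importance, k)
    kept = [t for i, t in enumerate(tokens) if i in rationale]
    return model.prob_positive(tokens) - model.prob_positive(kept)


def diagnosticity(score_pairs, higher_is_better=True):
    """Fraction of (faithful, random) score pairs where the faithful one is preferred."""
    wins = sum((f > r) if higher_is_better else (f < r) for f, r in score_pairs)
    return wins / len(score_pairs)


model = CountingModel()
tokens = ["the", "movie", "was", "great"]
faithful = [abs(WEIGHTS.get(t, 0.0)) for t in tokens]  # the toy model's true weights
arbitrary = [1.0, 0.0, 0.0, 0.0]  # assumed stand-in for a randomly generated interpretation

comp_faithful = comprehensiveness(model, tokens, faithful)
comp_arbitrary = comprehensiveness(model, tokens, arbitrary)
suff_faithful = sufficiency(model, tokens, faithful)
diag = diagnosticity([(comp_faithful, comp_arbitrary)])
print(comp_faithful, comp_arbitrary, suff_faithful, diag, model.forward_passes)
```

In this sketch each metric call costs two forward passes, so the counter on `CountingModel` gives a direct reading of the time-complexity dimension. A realistic diagnosticity estimate would average over many pairs rather than the single pair used here.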

