CL, AI, CY, HC, LG · Oct 9, 2020

Evaluating and Characterizing Human Rationales

arXiv:2010.04736v11013 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reliable rationale evaluation in AI interpretability, offering actionable suggestions for researchers, but it is incremental as it builds on existing evaluation approaches.

The paper tackles the problem of evaluating machine-generated rationales by analyzing how human rationales perform under automated metrics. It finds that human rationales often do not score well, and it proposes improved metrics along with methods such as fidelity curves to better characterize rationale quality.

Two main approaches for evaluating the quality of machine-generated rationales are: 1) using human rationales as a gold standard; and 2) automated metrics based on how rationales affect model behavior. An open question, however, is how human rationales fare with these automatic metrics. Analyzing a variety of datasets and models, we find that human rationales do not necessarily perform well on these metrics. To unpack this finding, we propose improved metrics to account for model-dependent baseline performance. We then propose two methods to further characterize rationale quality, one based on model retraining and one on using "fidelity curves" to reveal properties such as irrelevance and redundancy. Our work leads to actionable suggestions for evaluating and characterizing rationales.
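The abstract does not name the automated metrics, so the following is only a minimal sketch. It assumes they are the sufficiency and comprehensiveness scores commonly used in rationale evaluation, and it treats the model's output on an empty input as the model-dependent baseline. The names `toy_model`, `null_difference`, and `fidelity_curve` are hypothetical illustrations, not the authors' code, and the curve shown is one possible reading of a "fidelity curve".

```python
import math
from typing import Callable, List, Sequence, Tuple

Tokens = List[str]
Model = Callable[[Tokens], float]  # returns p(predicted class | tokens)


def toy_model(tokens: Tokens) -> float:
    """Hypothetical stand-in for a classifier: a tiny bag-of-words sentiment scorer."""
    weights = {"great": 2.0, "loved": 1.5, "boring": -2.0}
    score = sum(weights.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def keep(tokens: Tokens, mask: Sequence[bool]) -> Tokens:
    """Keep only the tokens marked as rationale."""
    return [t for t, m in zip(tokens, mask) if m]


def drop(tokens: Tokens, mask: Sequence[bool]) -> Tokens:
    """Remove the tokens marked as rationale."""
    return [t for t, m in zip(tokens, mask) if not m]


def sufficiency(model: Model, tokens: Tokens, mask: Sequence[bool]) -> float:
    """Drop in confidence when the model sees only the rationale (lower is better)."""
    return model(tokens) - model(keep(tokens, mask))


def comprehensiveness(model: Model, tokens: Tokens, mask: Sequence[bool]) -> float:
    """Drop in confidence when the rationale is removed (higher is better)."""
    return model(tokens) - model(drop(tokens, mask))


def null_difference(model: Model, tokens: Tokens) -> float:
    """Assumed baseline: drop in confidence when the model sees an empty input."""
    return model(tokens) - model([])


def fidelity_curve(
    model: Model, tokens: Tokens, mask: Sequence[bool], steps: int = 4
) -> List[Tuple[float, float]]:
    """Sufficiency gap as a growing fraction of the rationale tokens is kept."""
    rationale_idx = [i for i, m in enumerate(mask) if m]
    points = []
    for step in range(steps + 1):
        k = round(len(rationale_idx) * step / steps)
        kept = set(rationale_idx[:k])
        partial_mask = [i in kept for i in range(len(tokens))]
        points.append((step / steps, sufficiency(model, tokens, partial_mask)))
    return points


tokens = ["i", "loved", "this", "great", "film"]
human_rationale = [False, True, False, True, False]

suff = sufficiency(toy_model, tokens, human_rationale)
comp = comprehensiveness(toy_model, tokens, human_rationale)
base = null_difference(toy_model, tokens)
curve = fidelity_curve(toy_model, tokens, human_rationale, steps=2)
```

Comparing `suff` and `comp` against `base` gives a sense of how much of a score is attributable to the rationale rather than to the model's behavior on an empty input. A curve that flattens before the full rationale is included would hint at redundant tokens, though that interpretation is an assumption of this sketch.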

Code Implementations: 1 repo
