LG, Sep 8, 2021

Diagnostics-Guided Explanation Generation

arXiv:2109.03756v1, 10 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating high-quality explanations for AI models in domains lacking human annotations, offering a method to improve interpretability and performance in complex reasoning tasks.

The paper tackles explanation generation for machine learning models without human annotations by directly optimising for the diagnostic properties Faithfulness, Data Consistency, and Confidence Indication. The authors report improved explanation quality, better agreement with human rationales, and better downstream task performance on three complex reasoning tasks.

Explanations shed light on a machine learning model's rationales and can aid in identifying deficiencies in its reasoning process. Explanation generation models are typically trained in a supervised way given human explanations. When such annotations are not available, explanations are often selected as those portions of the input that maximise a downstream task's performance, which corresponds to optimising an explanation's Faithfulness to a given model. Faithfulness is one of several so-called diagnostic properties, which prior work has identified as useful for gauging the quality of an explanation without requiring annotations. Other diagnostic properties are Data Consistency, which measures how similar explanations are for similar input instances, and Confidence Indication, which shows whether the explanation reflects the confidence of the model. In this work, we show how to directly optimise for these diagnostic properties when training a model to generate sentence-level explanations, which markedly improves explanation quality, agreement with human rationales, and downstream task performance on three complex reasoning tasks.
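To make the three diagnostic properties concrete, the following is a minimal sketch of how each could be written as a loss term and combined. It is an illustration under stated assumptions, not the authors' implementation: the function names, the use of cross-entropy, cosine similarity and absolute difference, the equal default weights, and the idea that a "similar" instance is a perturbed copy of the same input (so both have the same number of sentences) are all hypothetical choices made here for clarity.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length score vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def faithfulness_loss(probs_from_explanation, gold_label):
    """Assumed form: cross-entropy of the task prediction made from the
    selected explanation sentences alone. Lower means the explanation
    is sufficient to recover the correct label."""
    return -math.log(probs_from_explanation[gold_label])


def data_consistency_loss(scores_a, scores_b, input_similarity):
    """Assumed form: gap between how similar two inputs are and how similar
    their per-sentence explanation scores are. Similar inputs should
    receive similar explanations."""
    return abs(input_similarity - cosine(scores_a, scores_b))


def confidence_indication_loss(confidence_from_explanation, model_confidence):
    """Assumed form: gap between the confidence inferred from the explanation
    and the confidence the task model actually reports."""
    return abs(confidence_from_explanation - model_confidence)


def total_loss(faith, consistency, confidence, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms; the weights are hypothetical hyperparameters."""
    w_f, w_d, w_c = weights
    return w_f * faith + w_d * consistency + w_c * confidence


if __name__ == "__main__":
    # Toy values only; none of these numbers come from the paper.
    faith = faithfulness_loss([0.2, 0.8], gold_label=1)
    consistency = data_consistency_loss([0.9, 0.1, 0.0], [0.8, 0.2, 0.0], input_similarity=0.95)
    confidence = confidence_indication_loss(0.75, 0.8)
    print(total_loss(faith, consistency, confidence))
```

In a real training setup these terms would be computed on differentiable model outputs and added to the task objective; the plain-Python version above only shows what each property measures.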

