LG, ML | May 21, 2025

Direct Preference Optimization for Adaptive Concept-based Explanations

arXiv:2505.15626v2 | h-index: 25
Originality: Incremental advance
AI Analysis

The paper addresses the need for personalized explanations in fields like healthcare, where different audiences require tailored communication. The advance is incremental, since it builds on existing concept-based methods.

The paper tackles the problem that concept-based explanation methods ignore listener preferences. It introduces a listener-adaptive approach based on direct preference optimization, which aligns speakers with simulated listeners on image classification datasets and improves participants' classification accuracy in a user study.

Concept-based explanation methods aim at making machine learning models more transparent by finding the most important semantic features of an input (e.g., colors, patterns, shapes) for a given prediction task. However, these methods generally ignore the communicative context of explanations, such as the preferences of a listener. For example, medical doctors understand explanations in terms of clinical markers, but patients may not, needing a different vocabulary to rationalize the same diagnosis. We address this gap with listener-adaptive explanations grounded in principles of pragmatic reasoning and the rational speech act. We introduce an iterative training procedure based on direct preference optimization where a speaker learns to compose explanations that maximize communicative utility for a listener. Our approach only needs access to pairwise preferences, which can be collected from human feedback, making it particularly relevant in real-world scenarios where a model of the listener may not be available. We demonstrate that our method is able to align speakers with the preferences of simulated listeners on image classification across three datasets, and further validate that pragmatic explanations generated with our method improve the classification accuracy of participants in a user study.
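
To make the role of pairwise preferences concrete, the following is a minimal sketch of the standard direct preference optimization loss for a single preference pair. It is not taken from the paper or its repository: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions for illustration, and the paper's iterative procedure may differ in its details. Here the "chosen" explanation is the one the listener prefers, and the reference log-probabilities come from a frozen copy of the speaker.

```python
import math


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    All inputs are log-probabilities of a whole explanation under either
    the trainable speaker (logp_*) or a frozen reference speaker (ref_logp_*).
    The names and the default beta are hypothetical, not from the paper.
    """
    # Implicit reward margin: how much more the speaker favors the chosen
    # explanation over the rejected one, relative to the reference speaker.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the speaker equals the reference, the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Raising the probability of the preferred explanation lowers the loss.
improved = dpo_loss(-0.5, -2.0, -1.0, -1.0)
```

Only the ordering within each pair enters the loss, which is consistent with the abstract's point that no explicit model of the listener is required.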

Code Implementations: 1 repo
