LG · AI · HC · Feb 2, 2021

Evaluating the Interpretability of Generative Models by Interactive Reconstruction

arXiv:2102.01264v1 · 55 citations
Originality: Incremental advance
AI Analysis

This paper addresses the lack of consensus on how to measure interpretability for generative models, particularly in representation learning, by providing a human-grounded evaluation method.

The authors introduce an interactive reconstruction task for measuring the interpretability of generative models. On synthetic datasets, performance on the task differentiates entangled from disentangled models more reliably than baseline approaches, and on a real dataset it distinguishes between representation learning methods.

For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synthetic datasets and not grounded in human factors. We introduce a task to quantify the human-interpretability of generative model representations, where users interactively modify representations to reconstruct target instances. On synthetic datasets, we find performance on this task much more reliably differentiates entangled and disentangled models than baseline approaches. On a real dataset, we find it differentiates between representation learning methods widely believed but never shown to produce more or less interpretable models. In both cases, we ran small-scale think-aloud studies and large-scale experiments on Amazon Mechanical Turk to confirm that our qualitative and quantitative results agreed.
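The task described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' implementation or study interface: the linear decoders, the greedy "simulated user", and all names (`linear_decoder`, `simulate_reconstruction`, `Attempt`) are assumptions made up for illustration. It only shows the general idea that moving one representation dimension at a time is easier when each dimension controls one output factor.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


def linear_decoder(matrix: Sequence[Sequence[float]]) -> Callable[[Sequence[float]], Vector]:
    """Hypothetical toy 'generative model': maps a representation z to an instance x = M z."""
    def decode(z: Sequence[float]) -> Vector:
        return [sum(w * zi for w, zi in zip(row, z)) for row in matrix]
    return decode


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two instances."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class Attempt:
    solved: bool        # did the reconstruction match the target within tolerance?
    steps: int          # number of single-slider moves made
    final_error: float  # squared distance to the target at the end


def simulate_reconstruction(decode, z_target, step=1.0, tol=1e-6, max_steps=100) -> Attempt:
    """Crude stand-in for a participant: start at the origin and repeatedly make
    the single-dimension move (+step or -step) that most reduces the distance to
    the target. Stop when the target is matched or no move helps."""
    target = decode(z_target)
    z = [0.0] * len(z_target)
    error = sq_dist(decode(z), target)
    steps = 0
    while error > tol and steps < max_steps:
        best_z, best_error = None, error
        for i in range(len(z)):
            for delta in (step, -step):
                candidate = list(z)
                candidate[i] += delta
                candidate_error = sq_dist(decode(candidate), target)
                if candidate_error < best_error:
                    best_z, best_error = candidate, candidate_error
        if best_z is None:  # stuck: no single-slider move improves the match
            break
        z, error = best_z, best_error
        steps += 1
    return Attempt(error <= tol, steps, error)


# Assumed toy models: each dimension controls one output (disentangled)
# versus every dimension affecting both outputs (entangled).
disentangled = linear_decoder([[1.0, 0.0], [0.0, 1.0]])
entangled = linear_decoder([[1.0, 0.9], [0.9, 1.0]])

for name, decode in [("disentangled", disentangled), ("entangled", entangled)]:
    print(name, simulate_reconstruction(decode, [3.0, -2.0]))
```

With these assumed settings, the greedy user reaches the target with the disentangled decoder but stalls short of it with the entangled one. A human participant is not limited to greedy moves, so this is only a rough analogy for the task described above, which relies on real user studies rather than simulation.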

Code Implementations: 1 repo