AI · Jul 12, 2017

A Formal Framework to Characterize Interpretability of Procedures

arXiv:1707.03886v1 · 19 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for a more rigorous and general definition of interpretability in machine learning. The advance is incremental: it builds on existing concepts but reframes them formally.

The paper tackles the problem of defining interpretability beyond human understanding. It introduces a formal framework that characterizes interpretability relative to a target model, links it to practical aspects such as accuracy and robustness, and applies the framework to current state-of-the-art methods.

We provide a novel notion of what it means to be interpretable, looking past the usual association with human understanding. Our key insight is that interpretability is not an absolute concept and so we define it relative to a target model, which may or may not be a human. We define a framework that allows for comparing interpretable procedures by linking it to important practical aspects such as accuracy and robustness. We characterize many of the current state-of-the-art interpretable methods in our framework, portraying its general applicability.

