ML, AI, LG · Oct 17, 2017

Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation

arXiv:1710.06169v4 · 208 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need to audit proprietary models in high-stakes domains such as criminal justice and lending. It offers stakeholders a practical method, though the advance is incremental because it builds on existing distillation techniques.

The paper tackles the problem of auditing opaque black-box risk scoring models. It proposes Distill-and-Compare, which uses transparent model distillation and comparison to gain insight into a black-box model without probing its API. The approach is demonstrated on four public data sets, and the accompanying statistical test finds that the ProPublica data is likely missing key features used in COMPAS.

Black-box risk scoring models permeate our lives, yet are typically proprietary or opaque. We propose Distill-and-Compare, a model distillation and comparison approach to audit such models. To gain insight into black-box models, we treat them as teachers, training transparent student models to mimic the risk scores assigned by black-box models. We compare the student model trained with distillation to a second un-distilled transparent model trained on ground-truth outcomes, and use differences between the two models to gain insight into the black-box model. Our approach can be applied in a realistic setting, without probing the black-box model API. We demonstrate the approach on four public data sets: COMPAS, Stop-and-Frisk, Chicago Police, and Lending Club. We also propose a statistical test to determine if a data set is missing key features used to train the black-box model. Our test finds that the ProPublica data is likely missing key feature(s) used in COMPAS.
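
The distill-and-compare idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data is synthetic, the "black-box" scores and the outcomes are generated by made-up linear formulas, and an ordinary least-squares linear model stands in for the transparent model class (the abstract does not say which one the authors use). The sketch needs only a data set that already contains the risk scores and the outcomes, so the black-box model is never queried.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear fit with an intercept; returns (intercept, coefs)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]


rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))  # three hypothetical features

# Hypothetical black-box risk score: leans heavily on feature 0, ignores feature 2.
risk_score = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Hypothetical ground-truth outcome: driven by features 1 and 2, not feature 0.
# (Kept continuous and on the same scale as the score purely for simplicity.)
outcome = 1.0 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=0.1, size=n)

# Student model: distilled from the black box by mimicking its risk scores.
_, student_coefs = fit_linear(X, risk_score)

# Second transparent model: trained on ground-truth outcomes, no distillation.
_, outcome_coefs = fit_linear(X, outcome)

# Compare: where the two transparent models disagree, the black box weights
# a feature differently from what the observed outcomes support.
diff = student_coefs - outcome_coefs
for j, d in enumerate(diff):
    print(f"feature {j}: student - outcome coefficient = {d:+.2f}")
```

In this toy setup, a large positive difference on feature 0 suggests the black box relies on that feature more than the outcomes justify, and a negative difference on feature 2 suggests it underuses that feature. With real data the scores and outcomes are usually on different scales, so some calibration step would be needed before the two models can be compared.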

Code Implementations: 1 repo
