LG · CR · CY · Sep 8, 2022

Black-Box Audits for Group Distribution Shifts

arXiv:2209.03620v1 · 6 citations · h-index: 34
Originality: Incremental advance
AI Analysis

The method lets external parties such as researchers and journalists audit proprietary models for underrepresentation without the model owner's cooperation, addressing fairness concerns in AI systems.

The paper tackles the problem of detecting performance disparities across demographic groups due to distribution shifts in proprietary models, and demonstrates that a black-box auditing method achieves 80-100% AUC-ROC in detecting such shifts.

When a model informs decisions about people, distribution shifts can create undue disparities. However, it is hard for external entities to check for distribution shift, as the model and its training set are often proprietary. In this paper, we introduce and study a black-box auditing method to detect cases of distribution shift that lead to a performance disparity of the model across demographic groups. By extending techniques used in membership and property inference attacks (which are designed to expose private information from learned models), we demonstrate that an external auditor can gain the information needed to identify these distribution shifts solely by querying the model. Our experimental results on real-world datasets show that this approach is effective, achieving 80-100% AUC-ROC in detecting shifts involving the underrepresentation of a demographic group in the training set. Researchers and investigative journalists can use our tools to perform non-collaborative audits of proprietary models and expose cases of underrepresentation in the training datasets.
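To make the reported metric concrete, here is a minimal sketch of how an AUC-ROC figure could be computed for an audit of this kind. It assumes the auditor produces one numeric score per target model (higher meaning "more likely trained with an underrepresented group") and that ground-truth labels are known for evaluation. The function name `auc_roc`, the scores, and the labels are all hypothetical illustrations, not the paper's code or data.

```python
def auc_roc(scores, labels):
    """AUC-ROC via pairwise comparison: the probability that a randomly
    chosen positive example receives a higher score than a randomly chosen
    negative one, with ties counted as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical audit scores for eight target models (made-up values).
audit_scores = [0.91, 0.78, 0.66, 0.40, 0.72, 0.35, 0.22, 0.10]
# 1 = a group was underrepresented in the model's training set, 0 = it was not.
is_shifted = [1, 1, 1, 1, 0, 0, 0, 0]

auc = auc_roc(audit_scores, is_shifted)
print(f"AUC-ROC: {auc:.3f}")
```

An AUC-ROC of 0.5 corresponds to an auditor that cannot tell shifted from unshifted models better than chance, and 1.0 to perfect separation, which is the scale on which the 80-100% result should be read.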

