LG | ML | Jun 23, 2022

Is your model predicting the past?

arXiv:2206.11673v2 | 12 citations | h-index: 54
Originality: Incremental advance
AI Analysis

This work addresses model interpretability and fairness for machine learning practitioners and researchers. It offers a new auditing tool, though the advance is incremental because it builds on existing statistical concepts.

The paper tackles the problem of distinguishing whether a machine learning model predicts future outcomes or merely recites past patterns, proposing a family of statistical tests called backward baselines to audit models as black boxes using background variables and predictions.

When does a machine learning model predict the future of individuals and when does it recite patterns that predate the individuals? In this work, we propose a distinction between these two pathways of prediction, supported by theoretical, empirical, and normative arguments. At the center of our proposal is a family of simple and efficient statistical tests, called backward baselines, that demonstrate if, and to what extent, a model recounts the past. Our statistical theory provides guidance for interpreting backward baselines, establishing equivalences between different baselines and familiar statistical concepts. Concretely, we derive a meaningful backward baseline for auditing a prediction system as a black box, given only background variables and the system's predictions. Empirically, we evaluate the framework on different prediction tasks derived from longitudinal panel surveys, demonstrating the ease and effectiveness of incorporating backward baselines into the practice of machine learning.
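To make the black-box audit setting concrete, here is a minimal sketch of one way to ask how much of a system's output is already determined by background variables. It is an illustration under stated assumptions, not the paper's definition of a backward baseline: the function name `explained_by_background`, the use of a single categorical background variable, and the group-mean predictor are all hypothetical choices made for this example.

```python
from collections import defaultdict
from statistics import fmean


def explained_by_background(background, predictions):
    """Share of variance in `predictions` explained by `background` group means.

    Hypothetical helper for illustration only. `background` holds one hashable
    background-variable value per individual; `predictions` holds the
    black-box system's scores for the same individuals.
    """
    if len(background) != len(predictions) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")

    # Group the system's predictions by background value.
    groups = defaultdict(list)
    for w, p in zip(background, predictions):
        groups[w].append(p)
    group_mean = {w: fmean(ps) for w, ps in groups.items()}

    # Total variation of the predictions around their overall mean.
    overall = fmean(predictions)
    total = sum((p - overall) ** 2 for p in predictions)
    if total == 0:
        return 0.0  # constant predictions: nothing to explain

    # Variation left over once each prediction is replaced by its group mean.
    residual = sum(
        (p - group_mean[w]) ** 2 for w, p in zip(background, predictions)
    )
    return 1.0 - residual / total


if __name__ == "__main__":
    # Toy data: the scores depend only on the background group.
    w = ["a", "a", "b", "b"]
    scores = [0.2, 0.2, 0.8, 0.8]
    print(explained_by_background(w, scores))
```

A value near 1 would suggest that, on this data, the system's scores are largely recoverable from the background variable alone, while a value near 0 would suggest they are not. How such a number should be interpreted, and which baseline is the meaningful one, is what the paper's statistical theory addresses.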

Code Implementations: 1 repo