Towards a Measure of Individual Fairness for Deep Learning
This work addresses fairness in AI from the perspective of affected individuals, but its contribution is incremental, since it builds on existing fairness concepts.
The paper tackles the problem of bias in deep learning predictions by proposing a novel measure called prediction sensitivity, which approximates how much a prediction depends on a protected attribute, and shows preliminary empirical results suggesting its effectiveness.
Deep learning has produced big advances in artificial intelligence, but trained neural networks often reflect and amplify bias in their training data, and thus produce unfair predictions. We propose a novel measure of individual fairness, called prediction sensitivity, that approximates the extent to which a particular prediction is dependent on a protected attribute. We show how to compute prediction sensitivity using standard automatic differentiation capabilities present in modern deep learning frameworks, and present preliminary empirical results suggesting that prediction sensitivity may be effective for measuring bias in individual predictions.
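As a rough illustration of the idea, the sketch below computes a sensitivity score by automatic differentiation. The abstract does not state the exact formula, so the sketch assumes that prediction sensitivity can be approximated by the magnitude of the partial derivative of the model's output with respect to the protected attribute. The `Dual` class, the logistic `model`, its weights, and the function name `prediction_sensitivity` are all hypothetical stand-ins; a real deep learning framework would supply the derivative through its own gradient facilities instead of this small forward-mode implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Dual number (value, derivative) for forward-mode automatic differentiation."""
    val: float
    grad: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.grad + other.grad)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.val * other.val,
                    self.val * other.grad + self.grad * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Wrap a plain number as a constant Dual (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sigmoid(z):
    """Logistic function with its derivative propagated by the chain rule."""
    s = 1.0 / (1.0 + math.exp(-z.val))
    return Dual(s, s * (1.0 - s) * z.grad)


def model(x, weights, bias):
    """Hypothetical one-layer classifier standing in for a trained network."""
    z = _lift(bias)
    for w, xi in zip(weights, x):
        z = z + w * xi
    return sigmoid(z)


def prediction_sensitivity(x, protected_idx, weights, bias):
    """ASSUMPTION: sensitivity = |d prediction / d x[protected_idx]|."""
    # Seed derivative 1 on the protected attribute and 0 on every other input.
    duals = [Dual(float(v), 1.0 if i == protected_idx else 0.0)
             for i, v in enumerate(x)]
    return abs(model(duals, weights, bias).grad)


# Illustrative values only: feature 1 plays the role of the protected attribute.
weights = [0.5, -2.0, 0.0]
bias = 0.1
x = [1.0, 0.0, 3.0]
score = prediction_sensitivity(x, protected_idx=1, weights=weights, bias=bias)
```

Under this assumption, a score near zero suggests that small changes to the protected attribute barely move this particular prediction, while a larger score suggests stronger local dependence. The score is per input, which matches the individual (rather than group) framing of the measure.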