LG AI IT

Feb 3, 2021

Impact of Data Processing on Fairness in Supervised Learning

Sajad Khodadadian, AmirEmad Ghassami, Negar Kiyavash

arXiv:2102.01867v1

8.49 citations

Originality: Incremental advance

AI Analysis

This research addresses the problem of reducing discrimination in data-driven decision-making. It is aimed at practitioners and researchers in fairness-aware machine learning and offers an incremental improvement in understanding and methodology.

This paper investigates the impact of data processing on fairness in supervised learning, focusing on pre-processing and post-processing methods. It proposes a pre-processing module based on convex optimization, which yields a fundamental lower bound on attainable discrimination for any acceptable distortion in the outcome. The study also shows that pre-processing outperforms post-processing under some mild conditions.

We study the impact of pre- and post-processing for reducing discrimination in data-driven decision makers. We first analyze the fundamental trade-off between fairness and accuracy in a pre-processing approach, and propose a design for a pre-processing module based on a convex optimization program, which can be added before the original classifier. This leads to a fundamental lower bound on attainable discrimination, given any acceptable distortion in the outcome. Furthermore, we reformulate an existing post-processing method in terms of our accuracy and fairness measures, which allows comparing post-processing and pre-processing approaches. We show that under some mild conditions, pre-processing outperforms post-processing. Finally, we show that by appropriate choice of the discrimination measure, the optimization problem for both pre- and post-processing approaches will reduce to a linear program and hence can be solved efficiently.
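
To give a feel for how a discrimination-versus-distortion trade-off can become a linear program, here is a minimal toy sketch. It is not the paper's formulation. It assumes two groups, a binary classifier output, a discrimination measure equal to the absolute gap in positive rates between groups, and a distortion measure equal to the probability that a randomized adjustment changes the original output. The function name `min_discrimination` and all numbers are hypothetical, and the solver is SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def min_discrimination(pos_rate, group_prob, eps):
    """Toy LP (illustrative assumption, not the paper's program).

    pos_rate[a]   = P(original output = 1 | group a), for a in {0, 1}
    group_prob[a] = P(group = a)
    eps           = largest acceptable probability of changing the output

    Decision variables q[a][y] = P(adjusted output = 1 | group a, original output y),
    plus an auxiliary t that upper-bounds the absolute positive-rate gap.
    """
    p1 = np.asarray(pos_rate, dtype=float)
    p0 = 1.0 - p1
    pi = np.asarray(group_prob, dtype=float)

    # Variable order: [q00, q01, q10, q11, t]; minimize t.
    c = np.array([0.0, 0.0, 0.0, 0.0, 1.0])

    # Adjusted positive rate of group a is p0[a]*q[a][0] + p1[a]*q[a][1].
    # diff holds the coefficients of (rate of group 0) - (rate of group 1).
    diff = np.array([p0[0], p1[0], -p0[1], -p1[1]])

    # Distortion = sum_a pi[a] * (p0[a]*q[a][0] + p1[a]*(1 - q[a][1])).
    # The constant term sum_a pi[a]*p1[a] is moved to the right-hand side.
    dist = np.array([pi[0] * p0[0], -pi[0] * p1[0],
                     pi[1] * p0[1], -pi[1] * p1[1]])

    A_ub = np.vstack([
        np.append(diff, -1.0),    #  (r0 - r1) - t <= 0
        np.append(-diff, -1.0),   # -(r0 - r1) - t <= 0
        np.append(dist, 0.0),     # distortion <= eps
    ])
    b_ub = np.array([0.0, 0.0, eps - float(pi @ p1)])

    # Probabilities lie in [0, 1]; the gap bound t is nonnegative.
    bounds = [(0.0, 1.0)] * 4 + [(0.0, None)]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun, res.x[:4].reshape(2, 2)


if __name__ == "__main__":
    # Hypothetical numbers: equal-sized groups with positive rates 0.6 and 0.3.
    for eps in (0.0, 0.05, 0.15):
        gap, _ = min_discrimination([0.6, 0.3], [0.5, 0.5], eps)
        print(f"eps={eps:.2f}  minimal gap={gap:.3f}")
```

In this toy setting, sweeping `eps` traces the smallest achievable gap for each distortion budget, which is the kind of lower-bound curve the abstract describes. The absolute value in the gap is handled with the standard trick of bounding it by an auxiliary variable, which is what keeps the program linear.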
