LG ML Sep 17, 2019

A Distributed Fair Machine Learning Framework with Private Demographic Data Protection

arXiv:1909.08081v17.728 citations Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of balancing fairness and privacy in machine learning for applications subject to regulations such as GDPR. The contribution is incremental because it builds on existing fair learning methods.

The paper tackles the problem of fair machine learning without direct access to private demographic data by proposing a distributed framework, and it shows that the proposed methods consistently outperform existing counterparts in both fairness and accuracy across three real-world datasets.

Fair machine learning has become a significant research topic with broad societal impact. However, most fair learning methods require direct access to personal demographic data, the use of which is increasingly restricted to protect user privacy (e.g. by the EU General Data Protection Regulation). In this paper, we propose a distributed fair learning framework for protecting the privacy of demographic data. We assume this data is privately held by a third party, which can communicate with the data center (responsible for model development) without revealing the demographic information. We propose a principled approach to designing fair learning methods under this framework, exemplify it with four methods, and show that they consistently outperform their existing counterparts in both fairness and accuracy across three real-world data sets. We theoretically analyze the framework and prove that it can learn models with high fairness or high accuracy, with the trade-off between the two balanced by a threshold variable.
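The abstract does not spell out the protocol, so the following is only a minimal sketch of the general idea, not the paper's actual algorithm. It assumes the third party answers a single accept/reject query per candidate model, that fairness is measured as the gap in positive-prediction rates between two groups, and that candidates are fixed linear classifiers. The names `ThirdParty`, `is_fair`, and `select_model` are hypothetical.

```python
import numpy as np


class ThirdParty:
    """Holds the private demographic attribute (assumed binary here).

    It never reveals the attribute; it only reports whether a vector of
    predictions satisfies the fairness threshold.
    """

    def __init__(self, s, threshold):
        self._s = np.asarray(s)      # private group labels, 0 or 1
        self.threshold = threshold   # maximum tolerated disparity

    def is_fair(self, predictions):
        preds = np.asarray(predictions)
        rate_0 = preds[self._s == 0].mean()  # positive rate in group 0
        rate_1 = preds[self._s == 1].mean()  # positive rate in group 1
        return bool(abs(rate_0 - rate_1) <= self.threshold)


def select_model(X, y, candidates, third_party):
    """Data-center side: keep the most accurate candidate judged fair.

    The data center sees only features X and labels y. It sends the
    predictions of each candidate to the third party and receives a
    yes/no answer in return.
    """
    best, best_acc = None, -1.0
    for w in candidates:
        preds = (X @ w > 0).astype(int)
        if not third_party.is_fair(preds):
            continue  # rejected by the third party
        acc = float((preds == y).mean())
        if acc > best_acc:
            best, best_acc = w, acc
    return best, best_acc
```

In this sketch the threshold plays the balancing role the abstract describes: a small threshold rejects accurate but disparate candidates, while a large threshold lets accuracy dominate the selection.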
