Jul 9, 2023

On The Impact of Machine Learning Randomness on Group Fairness

arXiv:2307.04138v1
40 citations
h-index: 44
Originality: Incremental advance
AI Analysis

The paper addresses the unreliability of fairness evaluation for under-represented groups. The advance is incremental because it builds on existing group fairness measures rather than proposing new ones.

The paper tackles the high variance of group fairness measures across training runs. It identifies the stochasticity of data order during training as the dominant source of that variance, and it shows that changing the data order for a single epoch can control group-level accuracy efficiently, with negligible impact on overall performance.
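To make "changing the data order for a single epoch" concrete, here is a minimal sketch, not the authors' implementation. It assumes data order is driven by its own seed, kept separate from any other source of randomness such as weight initialization. The function name `epoch_orders` and the parameters `swap_epoch` and `swap_seed` are hypothetical, introduced only for illustration.

```python
import numpy as np

def epoch_orders(n_samples, n_epochs, order_seed, swap_epoch=None, swap_seed=None):
    """Return one sample permutation per epoch.

    All epochs derive their order from `order_seed`, except `swap_epoch`,
    which (if given together with `swap_seed`) uses `swap_seed` instead.
    Names and seeding scheme are illustrative assumptions.
    """
    orders = []
    for epoch in range(n_epochs):
        use_swap = epoch == swap_epoch and swap_seed is not None
        seed = swap_seed if use_swap else order_seed
        # Seeding with [seed, epoch] gives each epoch its own reproducible shuffle.
        rng = np.random.default_rng([seed, epoch])
        orders.append(rng.permutation(n_samples))
    return orders

# Two schedules that differ only in the data order of the final epoch.
base = epoch_orders(n_samples=100, n_epochs=3, order_seed=0)
alt = epoch_orders(n_samples=100, n_epochs=3, order_seed=0, swap_epoch=2, swap_seed=1)
```

Training one model per schedule, with every other seed held fixed, would isolate the effect of that single epoch's order on group-level accuracy.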

Statistical measures for group fairness in machine learning reflect the gap in performance of algorithms across different groups. These measures, however, exhibit a high variance between different training instances, which makes them unreliable for empirical evaluation of fairness. What causes this high variance? We investigate the impact on group fairness of different sources of randomness in training neural networks. We show that the variance in group fairness measures is rooted in the high volatility of the learning process on under-represented groups. Further, we recognize the dominant source of randomness as the stochasticity of data order during training. Based on these findings, we show how one can control group-level accuracy (i.e., model fairness), with high efficiency and negligible impact on the model's overall performance, by simply changing the data order for a single epoch.
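The quantity discussed above, a performance gap between groups and its spread across training runs, can be sketched as follows. This is an illustration under stated assumptions: it uses the absolute accuracy difference between two groups as the fairness measure (the paper may use other measures), and the helper names `group_accuracy_gap` and `gap_spread` and the toy predictions are made up for the example.

```python
import numpy as np

def group_accuracy_gap(y_true, y_pred, group):
    """Absolute accuracy difference between group 0 and group 1."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    correct = y_true == y_pred
    acc0 = correct[group == 0].mean()
    acc1 = correct[group == 1].mean()
    return float(abs(acc0 - acc1))

def gap_spread(y_true, preds_per_run, group):
    """Mean and standard deviation of the gap across training runs."""
    gaps = np.array([group_accuracy_gap(y_true, p, group) for p in preds_per_run])
    return float(gaps.mean()), float(gaps.std())

# Toy data: group 1 is under-represented (2 of 6 samples).
y_true = [1, 1, 0, 0, 1, 0]
group = [0, 0, 0, 0, 1, 1]
run_a = [1, 1, 0, 0, 1, 0]  # every sample correct
run_b = [1, 1, 0, 0, 0, 1]  # only the two minority samples flip

mean_gap, std_gap = gap_spread(y_true, [run_a, run_b], group)
```

In this toy case only two predictions differ between the runs, yet the gap moves from 0 to 1. With few samples in a group, each flipped prediction shifts that group's accuracy by a large amount, which is one way to picture the volatility on under-represented groups.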

