fairret: a Framework for Differentiable Fairness Regularization Terms
This work addresses the problem of integrating diverse fairness constraints into modern ML pipelines, with practitioners as the intended audience. The contribution is incremental in that it builds on existing fairness definitions.
The authors tackle the limited integration of fairness definitions with automatic differentiation in machine learning by introducing a framework of fairness regularization terms (fairrets) that quantify bias as modular objectives. Their experiments show that fairness can be enforced with minimal loss of predictive power compared to baselines.
Current fairness toolkits in machine learning only admit a limited range of fairness definitions and have seen little integration with automatic differentiation libraries, despite the central role these libraries play in modern machine learning pipelines. We introduce a framework of fairness regularization terms (fairrets) which quantify bias as modular, flexible objectives that are easily integrated in automatic differentiation pipelines. By employing a general definition of fairness in terms of linear-fractional statistics, a wide class of fairrets can be computed efficiently. Experiments show the behavior of their gradients and their utility in enforcing fairness with minimal loss of predictive power compared to baselines. Our contribution includes a PyTorch implementation of the fairret framework.
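To illustrate how a bias measure can act as a modular objective in an automatic differentiation pipeline, the following is a minimal sketch in plain PyTorch. It does not use the fairret package's API; the function names `group_positive_rates` and `parity_penalty`, the synthetic data, and the `strength` weight are all hypothetical choices made for illustration. The statistic used here is the per-group positive rate, which is a ratio of two quantities that are linear in the model's predicted probabilities, and the penalty measures how far each group's rate deviates from the overall rate.

```python
import torch

def group_positive_rates(probs, sens):
    # probs: (N,) predicted probabilities; sens: (N, K) one-hot group membership.
    # Each group's rate is a ratio of two terms that are linear in probs.
    return (sens * probs.unsqueeze(1)).sum(dim=0) / sens.sum(dim=0)

def parity_penalty(logits, sens):
    # Hypothetical regularization term (an assumption, not the library's code):
    # sum over groups of |group rate / overall rate - 1|.
    probs = torch.sigmoid(logits).reshape(-1)
    rates = group_positive_rates(probs, sens)
    overall = probs.mean()
    return (rates / overall - 1.0).abs().sum()

# Synthetic data, assumed purely for demonstration.
torch.manual_seed(0)
features = torch.randn(64, 5)
labels = (features[:, 0] > 0).float().unsqueeze(1)
# Alternate samples between two groups so neither group is empty.
sens = torch.nn.functional.one_hot(torch.arange(64) % 2, num_classes=2).float()

model = torch.nn.Linear(5, 1)
logits = model(features)

# The penalty is added to the task loss like any other differentiable term.
task_loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, labels)
strength = 1.0  # assumed trade-off weight between accuracy and fairness
total_loss = task_loss + strength * parity_penalty(logits, sens)
total_loss.backward()  # gradients flow through both terms
```

Because the penalty is an ordinary tensor expression, no special solver is needed: the optimizer sees a single scalar loss. Swapping in a different statistic would only require changing the numerator and denominator computed in `group_positive_rates`.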