Fair Sufficient Representation Learning
This work addresses fairness in machine learning for applications like healthcare and text analysis, but it is incremental as it builds on existing fair representation learning methods.
The paper addresses the trade-off between sufficiency and fairness in representation learning. It introduces the Fair Sufficient Representation Learning (FSRL) method, which combines a sufficiency objective and a fairness objective through a convex combination, implemented with distance covariance, and reports a superior trade-off between fairness and accuracy on healthcare and text datasets.
The main objective of fair statistical modeling and machine learning is to minimize or eliminate biases that may arise from the data or the model itself, ensuring that predictions and decisions are not unjustly influenced by sensitive attributes such as race, gender, age, or other protected characteristics. In this paper, we introduce a Fair Sufficient Representation Learning (FSRL) method that balances sufficiency and fairness. Sufficiency requires that the representation capture all necessary information about the target variables, while fairness requires that the learned representation remain independent of sensitive attributes. FSRL is based on a convex combination of an objective function for learning a sufficient representation and an objective function that ensures fairness. Our approach manages fairness and sufficiency at the representation level, offering a novel perspective on fair representation learning. We implement this method using distance covariance, which is effective for characterizing independence between random variables. We establish the convergence properties of the learned representations. Experiments conducted on healthcare and text datasets with diverse structures demonstrate that FSRL achieves a superior trade-off between fairness and accuracy compared to existing approaches.
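To make the two ingredients named above concrete, the following is a minimal sketch of a sample distance covariance and of a convex combination of a sufficiency term and a fairness term. The abstract does not state the exact objective, so the form used here (reward dependence between representation and target, penalize dependence between representation and sensitive attribute) is an assumption, and the names `distance_covariance_sq`, `combined_objective`, and `alpha` are hypothetical and chosen for illustration only.

```python
import math


def _centered_distances(points):
    """Double-centered pairwise Euclidean distance matrix of a sample."""
    # Accept scalars or vectors; wrap scalars into 1-tuples.
    pts = [p if isinstance(p, (list, tuple)) else (p,) for p in points]
    n = len(pts)
    d = [[math.dist(pts[i], pts[j]) for j in range(n)] for i in range(n)]
    # The distance matrix is symmetric, so row means equal column means.
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic form) of paired samples."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    a = _centered_distances(x)
    b = _centered_distances(y)
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def combined_objective(rep, target, sensitive, alpha=0.5):
    """Convex combination of two terms, to be minimized (assumed form).

    alpha weights the sufficiency term (dependence of rep on target, rewarded);
    1 - alpha weights the fairness term (dependence of rep on the sensitive
    attribute, penalized).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    sufficiency = distance_covariance_sq(rep, target)
    dependence_on_sensitive = distance_covariance_sq(rep, sensitive)
    return -alpha * sufficiency + (1.0 - alpha) * dependence_on_sensitive


if __name__ == "__main__":
    rep = [0.1, 0.4, 0.35, 0.8]
    target = [0, 1, 0, 1]
    sensitive = [1, 1, 0, 0]
    print(combined_objective(rep, target, sensitive, alpha=0.7))
```

In this sketch, `alpha` close to 1 favors predictive information about the target, while `alpha` close to 0 favors independence from the sensitive attribute. In practice the representation would be the output of a trainable model and the objective would be minimized over the model parameters.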