Optimising Equal Opportunity Fairness in Model Training
This work addresses fairness in machine learning for real-world applications and offers a method for mitigating societal biases in trained models, though the contribution is incremental because it builds on existing debiasing techniques.
The paper tackles the problem of bias in trained models by proposing two novel training objectives that directly optimise for equal opportunity fairness, and shows that they reduce bias while maintaining high performance on two classification tasks.
Real-world datasets often encode stereotypes and societal biases. Such biases can be implicitly captured by trained models, leading to biased predictions and exacerbating existing societal preconceptions. Existing debiasing methods, such as adversarial training and removing protected information from representations, have been shown to reduce bias. However, a disconnect between fairness criteria and training objectives makes it difficult to reason theoretically about the effectiveness of different techniques. In this work, we propose two novel training objectives which directly optimise for the widely used criterion of equal opportunity, and show that they are effective in reducing bias while maintaining high performance over two classification tasks.
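To make the criterion concrete: equal opportunity is commonly taken to mean that the true positive rate is the same across protected groups. The sketch below measures how far a set of predictions departs from that, as the largest difference in per-group true positive rate. It illustrates the evaluation criterion only and is not the paper's training objectives, which are not specified in this summary. The function names, the toy labels, and the group identifiers "a" and "b" are all hypothetical.

```python
from collections import defaultdict


def true_positive_rates(y_true, y_pred, groups, positive=1):
    """Per-group true positive rate: P(pred = positive | true = positive, group)."""
    hits = defaultdict(int)    # correctly predicted positives per group
    totals = defaultdict(int)  # gold positives per group
    for t, p, g in zip(y_true, y_pred, groups):
        if t == positive:
            totals[g] += 1
            if p == positive:
                hits[g] += 1
    # Groups with no gold positives are skipped, since their rate is undefined.
    return {g: hits[g] / totals[g] for g in totals}


def equal_opportunity_gap(y_true, y_pred, groups, positive=1):
    """Largest difference in true positive rate between any two groups."""
    rates = true_positive_rates(y_true, y_pred, groups, positive)
    return max(rates.values()) - min(rates.values())


# Toy data (made up for illustration): group "a" has 4 gold positives,
# group "b" has 2.
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b"]

rates = true_positive_rates(y_true, y_pred, groups)
gap = equal_opportunity_gap(y_true, y_pred, groups)
print(rates, gap)
```

A gap of zero would mean the criterion is met exactly on this data. A training objective that targets equal opportunity directly would aim to shrink a quantity of this kind, rather than relying on a proxy such as removing protected information from representations.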