Fairness in Credit Scoring: Assessment, Implementation and Profit Implications
This paper addresses fairness in credit scoring for financial institutions and offers practical implementation insights, though its contribution is incremental because it builds on existing fair ML research.
The paper tackles the problem of applying fairness criteria in credit scoring in three steps: it assesses statistical fairness measures, catalogs algorithmic options for implementing them, and empirically compares fairness processors on real-world data. It finds that multiple fairness criteria can be approximately satisfied at once and that fair in-processors offer a good balance between profit and fairness.
The rise of algorithmic decision-making has spawned much research on fair machine learning (ML). Financial institutions use ML for building risk scorecards that support a range of credit-related decisions. Yet, the literature on fair ML in credit scoring is scarce. The paper makes three contributions. First, we revisit statistical fairness criteria and examine their adequacy for credit scoring. Second, we catalog algorithmic options for incorporating fairness goals in the ML model development pipeline. Last, we empirically compare different fairness processors in a profit-oriented credit scoring context using real-world data. The empirical results substantiate the evaluation of fairness measures, identify suitable options to implement fair credit scoring, and clarify the profit-fairness trade-off in lending decisions. We find that multiple fairness criteria can be approximately satisfied at once and recommend separation as a proper criterion for measuring the fairness of a scorecard. We also find fair in-processors to deliver a good balance between profit and fairness and show that algorithmic discrimination can be reduced to a reasonable level at a relatively low cost. The code accompanying the paper is available on GitHub.
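The statistical fairness criteria mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper or its GitHub repository. It assumes a binary sensitive attribute, binary accept/reject decisions, and the convention that a label of 1 marks a good applicant. The function name `fairness_gaps`, the toy arrays, and the simplified gap definitions are all hypothetical; the paper's exact metrics may differ. Under these assumptions, independence compares acceptance rates across groups, separation compares true and false positive rates, and sufficiency compares the share of good applicants among those accepted.

```python
import numpy as np

def fairness_gaps(y_true, y_pred, group):
    """Illustrative gaps for three fairness criteria (hypothetical helper).

    y_true: 1 = good applicant, 0 = bad applicant (assumed convention)
    y_pred: 1 = accepted, 0 = rejected
    group:  sensitive attribute with exactly two distinct values
    """
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    group = np.asarray(group)
    a, b = np.unique(group)  # fails unless there are exactly two groups

    def rates(mask):
        yt, yp = y_true[mask], y_pred[mask]
        # Each rate is NaN if its conditioning subset is empty.
        return {
            "accept": yp.mean(),   # P(accepted | group)
            "tpr": yp[yt].mean(),  # P(accepted | good, group)
            "fpr": yp[~yt].mean(), # P(accepted | bad, group)
            "ppv": yt[yp].mean(),  # P(good | accepted, group)
        }

    ra, rb = rates(group == a), rates(group == b)
    return {
        # Independence: decisions do not depend on the group.
        "independence": abs(ra["accept"] - rb["accept"]),
        # Separation: decisions do not depend on the group given the outcome.
        "separation": 0.5 * (abs(ra["tpr"] - rb["tpr"])
                             + abs(ra["fpr"] - rb["fpr"])),
        # Sufficiency: outcomes do not depend on the group given acceptance.
        "sufficiency": abs(ra["ppv"] - rb["ppv"]),
    }

# Toy data (made up for illustration): four applicants per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

gaps = fairness_gaps(y_true, y_pred, group)
for name, value in gaps.items():
    print(f"{name}: {value:.3f}")
```

A gap of zero means the criterion holds exactly on the sample. In practice one would compare gaps against a tolerance, in line with the finding that the criteria can be approximately, not exactly, satisfied at once.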