A Vertical Federated Learning Method for Interpretable Scorecard and Its Application in Credit Scoring
This work addresses credit scoring for financial institutions by enabling agencies to train a model collaboratively without sharing sensitive data. The contribution is incremental, since it adapts existing federated learning techniques to the traditional scorecard method.
The authors tackle the problem of training interpretable credit scoring models while preserving data privacy. They propose FL-LRBC, a vertical federated learning method based on logistic regression with bounded constraints, and report significant gains in AUC and the KS statistic, which they attribute to data enrichment.
With the success of big data and artificial intelligence in many fields, big-data-driven models are expected to find application in financial risk management, especially in credit scoring and rating. Under the premise of data privacy protection, we propose FL-LRBC, a projected gradient-based method in the vertical federated learning framework for the traditional scorecard, which is based on logistic regression with bounded constraints. FL-LRBC enables multiple agencies to jointly train an optimized scorecard model in a single training session. The resulting model has positive coefficients, and the time-consuming parameter-tuning process is avoided. Moreover, performance in terms of both AUC and the Kolmogorov-Smirnov (KS) statistic is significantly improved owing to the data enrichment that FL-LRBC makes possible. At present, FL-LRBC has already been applied to the credit business of a nationwide financial holdings group in China.
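To illustrate the core idea of logistic regression with bounded constraints solved by projected gradient descent, the following is a minimal single-party sketch in plain Python. It is not the FL-LRBC protocol: the federated exchange between agencies and any privacy mechanism are omitted entirely. The function names, the bounds `[0, 10]`, the learning rate, the iteration count, and the synthetic data are all assumptions made for illustration. A small KS helper is included to show how the separation between the two classes can be measured.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_bounded_logreg(X, y, lower=0.0, upper=10.0, lr=0.5, n_iter=500):
    """Projected gradient descent for logistic regression.

    Each weight is clipped to [lower, upper] after every gradient step;
    the intercept is left unconstrained. Bounds and step size are
    illustrative assumptions, not values from the paper.
    """
    n, d = len(X), len(X[0])
    w = [min(max(0.0, lower), upper)] * d
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step, then projection onto the box [lower, upper].
        w = [min(max(wj - lr * gj, lower), upper) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of the two classes."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    cum_pos = cum_neg = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        # Consume all tied scores before comparing the two distributions.
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            i += 1
        ks = max(ks, abs(cum_pos / n_pos - cum_neg / n_neg))
    return ks


# Synthetic data (hypothetical): the second feature has a negative true
# effect, so the lower bound of 0 is expected to become active for it.
random.seed(0)
X, y = [], []
for _ in range(400):
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    p = sigmoid(1.5 * x[0] - 1.0 * x[1])
    X.append(x)
    y.append(1 if random.random() < p else 0)

w, b = fit_bounded_logreg(X, y)
scores = [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]
ks = ks_statistic(scores, y)
print("weights:", w, "intercept:", b, "KS:", ks)
```

The projection step is what keeps every coefficient inside the chosen bounds in a single training run, rather than repeatedly refitting and discarding features whose coefficients have the wrong sign.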