Advanced User Credit Risk Prediction Model using LightGBM, XGBoost and Tabnet with SMOTEENN
This work addresses credit risk prediction for banks, but its contribution is incremental: it applies existing methods to a specific dataset, with tuning and a combination of techniques.
The study tackled credit risk prediction for bank credit card applicants by comparing machine learning models like LightGBM, XGBoost, and Tabnet with preprocessing techniques, finding that LightGBM combined with PCA and SMOTEENN achieved relatively outstanding performance in identifying high-quality customers.
Bank credit risk is a significant challenge in modern financial transactions, and the ability to identify qualified credit card holders among a large number of applicants is crucial to the profitability of a bank's credit card business. In the past, screening applicants' conditions required a significant amount of manual work, which was time-consuming and labor-intensive. Although the accuracy and reliability of previously used ML models have steadily improved, major banks continue to seek more reliable and powerful models. In this study, we used a dataset of over 40,000 records provided by a commercial bank. We compared dimensionality reduction techniques such as PCA and t-SNE for preprocessing the high-dimensional dataset, and we adapted and tuned distributed models such as LightGBM and XGBoost, as well as deep models such as Tabnet. We obtained strong results by combining SMOTEENN with these techniques. The experiments demonstrated that LightGBM combined with PCA and SMOTEENN can help banks accurately predict potential high-quality customers, showing relatively outstanding performance compared to other models.
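To make the resampling step concrete, the following is a minimal pure-Python sketch of what SMOTEENN does: SMOTE first synthesizes minority-class samples by interpolating between a minority point and one of its nearest minority neighbours, and Edited Nearest Neighbours (ENN) then removes samples whose label disagrees with their neighbourhood. It is an illustration only, not the pipeline used in the study. The function names, the toy two-dimensional data, the value `k=3`, and the majority-vote rule in the ENN step are all assumptions made for this example, and the sketch assumes the minority class has at least two samples.

```python
import math
import random
from collections import Counter


def nearest(point, pool, k):
    """Return indices of the k points in pool closest to point (Euclidean)."""
    order = sorted(range(len(pool)), key=lambda i: math.dist(point, pool[i]))
    return order[:k]


def smote(X, y, minority, k=3, rng=None):
    """Oversample the minority class until it matches the largest class."""
    rng = rng or random.Random(0)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_needed = max(Counter(y).values()) - len(minority_pts)
    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority_pts)
        # The closest match is the base point itself (distance 0), so skip it.
        neighbours = nearest(base, minority_pts, k + 1)[1:]
        other = minority_pts[rng.choice(neighbours)]
        gap = rng.random()
        # New point lies on the segment between base and its neighbour.
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return X + synthetic, y + [minority] * len(synthetic)


def edited_nearest_neighbours(X, y, k=3):
    """Keep only samples whose label matches the majority of their k neighbours.

    Assumption: majority voting; stricter variants require all neighbours to agree.
    """
    keep_X, keep_y = [], []
    for i, (point, label) in enumerate(zip(X, y)):
        others = [j for j in range(len(X)) if j != i]
        ranked = sorted(others, key=lambda j: math.dist(point, X[j]))[:k]
        votes = Counter(y[j] for j in ranked)
        if votes.most_common(1)[0][0] == label:
            keep_X.append(point)
            keep_y.append(label)
    return keep_X, keep_y


def smoteenn(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_over, y_over = smote(X, y, minority, k, random.Random(seed))
    return edited_nearest_neighbours(X_over, y_over, k)


# Hypothetical imbalanced toy data: 40 samples of class 0, 8 of class 1.
rng = random.Random(42)
common = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
rare = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(8)]
X = common + rare
y = [0] * 40 + [1] * 8

X_res, y_res = smoteenn(X, y, minority=1)
print("before:", Counter(y), "after:", Counter(y_res))
```

In practice, a maintained implementation such as the `SMOTEENN` class in the imbalanced-learn package would normally be preferred. Resampling should be applied to the training split only, so that the evaluation data keeps the original class distribution.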