EMLG Jul 30, 2019

Predicting credit default probabilities using machine learning techniques in the face of unequal class distributions

arXiv:1907.12996v1
Originality: Synthesis-oriented
AI Analysis

The paper addresses credit scoring for financial institutions, but its contribution is incremental because it focuses on benchmarking existing methods.

This study benchmarked 23 statistical and machine learning methods for predicting credit default probabilities, finding that ensemble methods performed best and simple sampling strategies outperformed more complex ones in handling class imbalances across four datasets.

This study benchmarks 23 statistical and machine learning methods in a credit scoring application. The models' performance is evaluated on four data sets, each combined with five data sampling strategies that address the class imbalance in the data. Six performance measures are used to cover different aspects of predictive performance. The results indicate a strong superiority of ensemble methods and show that simple sampling strategies deliver better results than more sophisticated ones.
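To make the idea of a "simple sampling strategy" concrete, here is a minimal sketch of random oversampling and random undersampling on a toy imbalanced data set. The abstract does not name the five strategies the study uses, so treating these two as representative is an assumption; the function names, the toy data, and the 95/5 class split are all hypothetical.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw with replacement to fill the gap to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows of larger classes until every class is as
    small as the smallest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw without replacement so no kept row is duplicated.
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 5 defaults (label 1) among 100 loans.
rows = [[i] for i in range(100)]
labels = [1 if i < 5 else 0 for i in range(100)]

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print(Counter(over_labels))   # both classes now have 95 rows
print(Counter(under_labels))  # both classes now have 5 rows
```

In a benchmark of this kind, resampling would normally be applied to the training split only, so that the evaluation data keeps the original class distribution.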

