48.4LGMay 18
Foundation Models for Credit Risk Prediction: A Game Changer?Bart Baesens, Andreas Goethals, Stefan Lessmann et al.
Predictive models play a pivotal role in credit risk management, guiding critical decisions through accurate estimation of default probabilities and losses. Extensive research has introduced new modeling techniques, complemented by large-scale benchmarking studies consolidating the state-of-the-art. Today, quasi-standards such as gradient-boosting models paired with SHAP explainers have emerged, yet continuous improvement of risk models remains a top priority. Concurrently, rapid advancements in AI, most notably large language models, have disrupted predictive modeling paradigms. Foundation models, pretrained on extensive datasets from diverse domains, have demonstrated remarkable performance by leveraging prior knowledge. While prevalent in natural language processing and computer vision, foundation models for tabular data have only recently emerged. We conjecture that pretraining on out-of-domain data is particularly beneficial in small-data settings, such as SME lending or specialized corporate portfolios, and may help address longstanding challenges including low default portfolios and class imbalance. This paper benchmarks recently proposed tabular foundation models against a broad set of competitors, including established and advanced machine learning techniques, across two core tasks: PD and LGD modeling. Our evaluation encompasses various datasets, performance indicators, and experimental conditions. We find that tabular foundation models generally perform best across datasets and tasks. Moreover, they offer significant improvement in predictive performance as dataset size shrinks. These results are remarkable given that the models are tested out-of-the-box, without hyperparameter tuning, ensuring ease of use and mitigating computational costs.
LGDec 2, 2021Code
Deep residential representations: Using unsupervised learning to unlock elevation data for geo-demographic predictionMatthew Stevenson, Christophe Mues, Cristián Bravo
LiDAR (short for "Light Detection And Ranging" or "Laser Imaging, Detection, And Ranging") technology can be used to provide detailed three-dimensional elevation maps of urban and rural landscapes. To date, airborne LiDAR imaging has been predominantly confined to the environmental and archaeological domains. However, the geographically granular and open-source nature of this data also lends itself to an array of societal, organizational and business applications where geo-demographic type data is utilised. Arguably, the complexity involved in processing this multi-dimensional data has thus far restricted its broader adoption. In this paper, we propose a series of convenient task-agnostic tile elevation embeddings to address this challenge, using recent advances from unsupervised Deep Learning. We test the potential of our embeddings by predicting seven English indices of deprivation (2019) for small geographies in the Greater London area. These indices cover a range of socio-economic outcomes and serve as a proxy for a wide variety of downstream tasks to which the embeddings can be applied. We consider the suitability of this data not just on its own but also as an auxiliary source of data in combination with demographic features, thus providing a realistic use case for the embeddings. Having trialled various model/embedding configurations, we find that our best performing embeddings lead to Root-Mean-Squared-Error (RMSE) improvements of up to 21% over using standard demographic features alone. We also demonstrate how our embedding pipeline, using Deep Learning combined with K-means clustering, produces coherent tile segments which allow the latent embedding features to be interpreted.
PMJul 22, 2024
Large-scale Time-Varying Portfolio Optimisation using Graph Attention NetworksKamesh Korangi, Christophe Mues, Cristián Bravo
Apart from assessing individual asset performance, investors in financial markets also need to consider how a set of firms performs collectively as a portfolio. Whereas traditional Markowitz-based mean-variance portfolios are widespread, network-based optimisation techniques offer a more flexible tool to capture complex interdependencies between asset values. However, most of the existing studies do not contain firms at risk of default and remove any firms that drop off indices over a certain time. This is the first study to also incorporate such firms in portfolio optimisation on a large scale. We propose and empirically test a novel method that leverages Graph Attention networks (GATs), a subclass of Graph Neural Networks (GNNs). GNNs, as deep learning-based models, can exploit network data to uncover nonlinear relationships. Their ability to handle high-dimensional data and accommodate customised layers for specific purposes makes them appealing for large-scale problems such as mid- and small-cap portfolio optimisation. This study utilises 30 years of data on mid-cap firms, creating graphs of firms using distance correlation and the Triangulated Maximally Filtered Graph approach. These graphs are the inputs to a GAT model incorporating weight and allocation constraints and a loss function derived from the Sharpe ratio, thus focusing on maximising portfolio risk-adjusted returns. This new model is benchmarked against a network characteristic-based portfolio, a mean variance-based portfolio, and an equal-weighted portfolio. The results show that the portfolio produced by the GAT-based model outperforms all benchmarks and is consistently superior to other strategies over a long period, while also being informative of market dynamics.
GNFeb 1, 2024
Attention-based Dynamic Multilayer Graph Neural Networks for Loan Default PredictionSahab Zandi, Kamesh Korangi, María Óskarsdóttir et al.
Whereas traditional credit scoring tends to employ only individual borrower- or loan-level predictors, it has been acknowledged for some time that connections between borrowers may result in default risk propagating over a network. In this paper, we present a model for credit risk assessment leveraging a dynamic multilayer network built from a Graph Neural Network and a Recurrent Neural Network, each layer reflecting a different source of network connection. We test our methodology in a behavioural credit scoring context using a dataset provided by U.S. mortgage financier Freddie Mac, in which different types of connections arise from the geographical location of the borrower and their choice of mortgage provider. The proposed model considers both types of connections and the evolution of these connections over time. We enhance the model by using a custom attention mechanism that weights the different time snapshots according to their importance. After testing multiple configurations, a model with GAT, LSTM, and the attention mechanism provides the best results. Empirical results demonstrate that, when it comes to predicting probability of default for the borrowers, our proposed model brings both better results and novel insights for the analysis of the importance of connections and timestamps, compared to traditional methods.
LGFeb 2, 2024
A Distributionally Robust Optimisation Approach to Fair Credit ScoringPablo Casas, Christophe Mues, Huan Yu
Credit scoring has been catalogued by the European Commission and the Executive Office of the US President as a high-risk classification task, a key concern being the potential harms of making loan approval decisions based on models that would be biased against certain groups. To address this concern, recent credit scoring research has considered a range of fairness-enhancing techniques put forward by the machine learning community to reduce bias and unfair treatment in classification systems. While the definition of fairness or the approach they follow to impose it may vary, most of these techniques, however, disregard the robustness of the results. This can create situations where unfair treatment is effectively corrected in the training set, but when producing out-of-sample classifications, unfair treatment is incurred again. Instead, in this paper, we will investigate how to apply Distributionally Robust Optimisation (DRO) methods to credit scoring, thereby empirically evaluating how they perform in terms of fairness, ability to classify correctly, and the robustness of the solution against changes in the marginal proportions. In so doing, we find DRO methods to provide a substantial improvement in terms of fairness, with almost no loss in performance. These results thus indicate that DRO can improve fairness in credit scoring, provided that further advances are made in efficiently implementing these systems. In addition, our analysis suggests that many of the commonly used fairness metrics are unsuitable for a credit scoring setting, as they depend on the choice of classification threshold.
GNOct 10, 2025
A Multimodal Approach to SME Credit Scoring Integrating Transaction and Ownership NetworksSahab Zandi, Kamesh Korangi, Juan C. Moreno-Paredes et al.
Small and Medium-sized Enterprises (SMEs) are known to play a vital role in economic growth, employment, and innovation. However, they tend to face significant challenges in accessing credit due to limited financial histories, collateral constraints, and exposure to macroeconomic shocks. These challenges make an accurate credit risk assessment by lenders crucial, particularly since SMEs frequently operate within interconnected firm networks through which default risk can propagate. This paper presents and tests a novel approach for modelling the risk of SME credit, using a unique large data set of SME loans provided by a prominent financial institution. Specifically, our approach employs Graph Neural Networks to predict SME default using multilayer network data derived from common ownership and financial transactions between firms. We show that combining this information with traditional structured data not only improves application scoring performance, but also explicitly models contagion risk between companies. Further analysis shows how the directionality and intensity of these connections influence financial risk contagion, offering a deeper understanding of the underlying processes. Our findings highlight the predictive power of network data, as well as the role of supply chain networks in exposing SMEs to correlated default risk.
GNNov 18, 2021
A transformer-based model for default prediction in mid-cap corporate marketsKamesh Korangi, Christophe Mues, Cristián Bravo
In this paper, we study mid-cap companies, i.e. publicly traded companies with less than US $10 billion in market capitalisation. Using a large dataset of US mid-cap companies observed over 30 years, we look to predict the default probability term structure over the medium term and understand which data sources (i.e. fundamental, market or pricing data) contribute most to the default risk. Whereas existing methods typically require that data from different time periods are first aggregated and turned into cross-sectional features, we frame the problem as a multi-label time-series classification problem. We adapt transformer models, a state-of-the-art deep learning model emanating from the natural language processing domain, to the credit risk modelling setting. We also interpret the predictions of these models using attention heat maps. To optimise the model further, we present a custom loss function for multi-label classification and a novel multi-channel architecture with differential training that gives the model the ability to use all input data efficiently. Our results show the proposed deep learning architecture's superior performance, resulting in a 13% improvement in AUC (Area Under the receiver operating characteristic Curve) over traditional models. We also demonstrate how to produce an importance ranking for the different data sources and the temporal relationships using a Shapley approach specific to these models.
LGMar 19, 2020
The value of text for small business default prediction: A deep learning approachMatthew Stevenson, Christophe Mues, Cristián Bravo
Compared to consumer lending, Micro, Small and Medium Enterprise (mSME) credit risk modelling is particularly challenging, as, often, the same sources of information are not available. Therefore, it is standard policy for a loan officer to provide a textual loan assessment to mitigate limited data availability. In turn, this statement is analysed by a credit expert alongside any available standard credit data. In our paper, we exploit recent advances from the field of Deep Learning and Natural Language Processing (NLP), including the BERT (Bidirectional Encoder Representations from Transformers) model, to extract information from 60 000 textual assessments provided by a lender. We consider the performance in terms of the AUC (Area Under the receiver operating characteristic Curve) and Brier Score metrics and find that the text alone is surprisingly effective for predicting default. However, when combined with traditional data, it yields no additional predictive capability, with performance dependent on the text's length. Our proposed deep learning model does, however, appear to be robust to the quality of the text and therefore suitable for partly automating the mSME lending process. We also demonstrate how the content of loan assessments influences performance, leading us to a series of recommendations on a new strategy for collecting future mSME loan assessments.