GN | Nov 5, 2021
Predicting Mortality from Credit Reports
Giacomo De Giorgi, Matthew Harding, Gabriel Vasconcelos
Data on hundreds of variables related to individual consumer finance behavior (such as credit card and loan activity) are routinely collected in many countries and play an important role in lending decisions. We postulate that the detailed nature of these data may be used to predict outcomes in seemingly unrelated domains such as individual health. We build a series of machine learning models to demonstrate that credit report data can be used to predict individual mortality. Variable groups related to credit cards and various loans, mostly unsecured loans, are shown to carry significant predictive power. Lags of these variables are also significant, indicating that dynamics matter as well. Improved mortality predictions based on consumer finance data can have important economic implications in insurance markets but may also raise privacy concerns.
PM | Feb 5, 2020
Sharpe Ratio Analysis in High Dimensions: Residual-Based Nodewise Regression in Factor Models
Mehmet Caner, Marcelo Medeiros, Gabriel Vasconcelos
We provide a new theory for nodewise regression when the residuals from a fitted factor model are used. We apply our results to the analysis of the consistency of Sharpe ratio estimators when there are many assets in a portfolio. We allow for an increasing number of assets as well as time observations of the portfolio. Because nodewise regression is infeasible when the idiosyncratic errors are unobserved, we provide a feasible residual-based nodewise regression to estimate the precision matrix of the errors, which is consistent even when the number of assets, p, exceeds the time span of the portfolio, n. In another new development, we also show that the precision matrix of returns can be estimated consistently, even with an increasing number of factors and p>n. We show that: (1) with p>n, the Sharpe ratio estimators are consistent in global minimum-variance and mean-variance portfolios; (2) with p>n, the maximum Sharpe ratio estimator is consistent when the portfolio weights sum to one; and (3) with p<<n, the maximum out-of-sample Sharpe ratio estimator is consistent.
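To make the nodewise construction concrete, here is a minimal sketch of how a precision matrix can be assembled one row at a time from regressions of each asset on all the others, and then plugged into textbook Sharpe ratio formulas. It is not the authors' estimator. It assumes p < n and uses unpenalized least squares where nodewise regression is normally run with a lasso penalty, it works on raw simulated returns rather than residuals from a fitted factor model, and the function names (`nodewise_precision`, `gmv_sharpe`, `max_sharpe`) are hypothetical.

```python
import numpy as np

def nodewise_precision(X):
    """Build a precision matrix row by row from nodewise regressions.

    Assumption: plain OLS with p < n. A high-dimensional version would
    replace the least-squares step with a penalized (e.g. lasso) fit.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)  # center each column
    theta = np.zeros((p, p))
    for j in range(p):
        others = [k for k in range(p) if k != j]
        # Regress column j on all remaining columns.
        gamma, *_ = np.linalg.lstsq(Xc[:, others], Xc[:, j], rcond=None)
        resid = Xc[:, j] - Xc[:, others] @ gamma
        tau2 = resid @ resid / n  # residual variance of node j
        theta[j, j] = 1.0 / tau2
        theta[j, others] = -gamma / tau2
    return theta

def gmv_sharpe(mu, theta):
    """Sharpe ratio of the global minimum-variance portfolio."""
    ones = np.ones(len(mu))
    return (ones @ theta @ mu) / np.sqrt(ones @ theta @ ones)

def max_sharpe(mu, theta):
    """Maximum attainable Sharpe ratio, sqrt(mu' Theta mu)."""
    return np.sqrt(mu @ theta @ mu)

# Simulated returns (illustrative data only).
rng = np.random.default_rng(0)
n, p = 500, 5
returns = rng.normal(loc=0.01, scale=0.05, size=(n, p))

theta_hat = nodewise_precision(returns)
mu_hat = returns.mean(axis=0)
sample_cov = np.cov(returns, rowvar=False, bias=True)

print("GMV Sharpe ratio:", gmv_sharpe(mu_hat, theta_hat))
print("Max Sharpe ratio:", max_sharpe(mu_hat, theta_hat))
```

Without a penalty, the nodewise estimate coincides with the inverse of the sample covariance matrix (normalized by n), which gives a convenient sanity check. The penalized version matters precisely when p > n and that inverse does not exist.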
ML | Aug 10, 2018
BooST: Boosting Smooth Trees for Partial Effect Estimation in Nonlinear Regressions
Yuri Fonseca, Marcelo Medeiros, Gabriel Vasconcelos et al.
In this paper, we introduce a new machine learning (ML) model for nonlinear regression called Boosted Smooth Transition Regression Trees (BooST), which combines boosting algorithms with smooth transition regression trees. The main advantage of the BooST model is that it estimates the derivatives (partial effects) of very general nonlinear models. The model can therefore offer more insight into the mapping between the covariates and the dependent variable than other tree-based models, such as Random Forests. We present several examples with both simulated and real data.
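The idea of replacing hard splits with smooth transitions, so that the fitted function has a closed-form derivative, can be illustrated with a small sketch. This is a simplified stand-in, not the BooST algorithm itself: it boosts single-split "smooth stumps" on one covariate, fixes the logistic slope `gamma`, and searches split locations over a quantile grid. All function names and parameter values here are assumptions made for illustration.

```python
import numpy as np

def logistic(x, gamma, c):
    """Smooth transition function that replaces a hard split at c."""
    return 1.0 / (1.0 + np.exp(-gamma * (x - c)))

def fit_smooth_stump(x, r, gamma, candidates):
    """Pick the split location c that minimizes squared error on r."""
    best = None
    for c in candidates:
        g = logistic(x, gamma, c)
        Z = np.column_stack([1.0 - g, g])  # soft "left" and "right" membership
        beta, *_ = np.linalg.lstsq(Z, r, rcond=None)
        sse = np.sum((r - Z @ beta) ** 2)
        if best is None or sse < best[0]:
            best = (sse, c, beta)
    return best[1], best[2]

def boost(x, y, n_rounds=200, shrinkage=0.1, gamma=2.0):
    """Boosting loop: repeatedly fit smooth stumps to the residuals."""
    candidates = np.quantile(x, np.linspace(0.05, 0.95, 19))
    intercept = y.mean()
    resid = y - intercept
    stumps = []
    for _ in range(n_rounds):
        c, beta = fit_smooth_stump(x, resid, gamma, candidates)
        g = logistic(x, gamma, c)
        resid = resid - shrinkage * ((1 - g) * beta[0] + g * beta[1])
        stumps.append((c, beta))
    return intercept, stumps, shrinkage, gamma

def predict(model, x):
    intercept, stumps, shrinkage, gamma = model
    out = np.full_like(x, intercept, dtype=float)
    for c, beta in stumps:
        g = logistic(x, gamma, c)
        out += shrinkage * ((1 - g) * beta[0] + g * beta[1])
    return out

def partial_effect(model, x):
    """Analytical derivative of the fitted function with respect to x."""
    _, stumps, shrinkage, gamma = model
    out = np.zeros_like(x, dtype=float)
    for c, beta in stumps:
        g = logistic(x, gamma, c)
        # d/dx [(1 - g) * b0 + g * b1] = (b1 - b0) * gamma * g * (1 - g)
        out += shrinkage * (beta[1] - beta[0]) * gamma * g * (1 - g)
    return out

# Simulated example (illustrative data only).
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=400)
y = np.sin(x) + rng.normal(scale=0.1, size=400)

model = boost(x, y)
grid = np.linspace(-2, 2, 9)
print("fitted values:  ", np.round(predict(model, grid), 3))
print("partial effects:", np.round(partial_effect(model, grid), 3))
```

Because every stump is differentiable, the derivative of the ensemble is simply the shrinkage-weighted sum of the stump derivatives. A tree with hard splits is piecewise constant, so its derivative is zero almost everywhere and carries no comparable information.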