LGHC · Apr 23, 2023

Improved Churn Causal Analysis Through Restrained High-Dimensional Feature Space Effects in Financial Institutions

arXiv:2304.11503v1 · 12 citations · h-index: 31
Originality: Synthesis-oriented
AI Analysis

This work addresses churn prediction for financial institutions, offering incremental improvements through a hybrid method on real-world data.

This study tackles customer churn prediction in financial institutions with a framework that combines SMOTE, ensemble ANN, and Bayesian networks to handle high-dimensional data. Random forest and the authors' ensemble ANN model both reach 86% accuracy. The causal analysis identifies specific confounding variables, such as super guarantee contribution and account balance, as causes of churn with a high degree of belief.

Customer churn describes the termination of a relationship with a business, or a reduction in customer engagement, over a specific period. Customer acquisition can cost five to six times as much as customer retention, so investing in customers at risk of churning is worthwhile. Causal analysis of a churn model can predict whether a customer will churn in the foreseeable future and identify the effects and possible causes of churn. This study presents a conceptual framework for discovering confounding features that correlate with the independent variables and are causally related to the dependent variables that impact churn. We combine several algorithms, including SMOTE, ensemble ANN, and Bayesian networks, to address churn prediction on the massive, high-dimensional finance data that financial institutions typically generate through the interval-based features used in Customer Relationship Management systems. The effects of the curse and blessing of dimensionality are assessed using the Recursive Feature Elimination method to overcome the high-dimensional feature space problem. Moreover, causal discovery is performed to find possible interpretation methods that describe the cause probabilities leading to customer churn. Evaluation metrics on validation data confirm that the random forest and our ensemble ANN model, with 86% accuracy, outperformed the other approaches. Causal analysis results confirm that independent causal variables representing the level of super guarantee contribution, account growth, and account balance were identified as confounding variables that cause customer churn with a high degree of belief. This article provides a real-world customer churn analysis, from current status inference to future directions, in local superannuation funds.
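To make the class-balancing step mentioned in the abstract more concrete, here is a minimal sketch of SMOTE-style oversampling using only the Python standard library. It is an illustration under stated assumptions, not the authors' implementation: the function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy feature values are all hypothetical, and the abstract does not say which SMOTE configuration was used.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest minority neighbours.
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [account_balance, contribution_level] for churned customers.
churned = [[1.0, 0.2], [1.2, 0.1], [0.8, 0.3], [1.1, 0.25]]
new_rows = smote(churned, n_new=6)
```

Each synthetic row lies on a line segment between two real minority rows, so the oversampled class stays inside the region the churned customers already occupy. In practice a maintained library implementation would likely be preferable to this hand-rolled version.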
