LG | May 18, 2015

Ensemble of Example-Dependent Cost-Sensitive Decision Trees

Alejandro Correa Bahnsen, Djamila Aouada, Bjorn Ottersten

arXiv:1505.04637v12.124 citations

Originality: Incremental advance

AI Analysis

This work addresses cost-sensitive classification for applications such as fraud detection and credit scoring. It is an incremental advance: it extends existing example-dependent cost-sensitive decision trees with ensemble methods.

The authors tackled the problem of example-dependent cost-sensitive classification, where misclassification costs vary per example, by proposing an ensemble framework of cost-sensitive decision trees and two new combination methods, achieving higher savings across five real-world databases compared to state-of-the-art techniques.

Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples and not only within classes. However, standard classification methods do not take these costs into account, and assume a constant cost of misclassification errors. In previous works, some methods that incorporate the financial costs into the training of different algorithms have been proposed, with the example-dependent cost-sensitive decision tree algorithm being the one that gives the highest savings. In this paper we propose a new framework of ensembles of example-dependent cost-sensitive decision trees. The framework consists of creating different example-dependent cost-sensitive decision trees on random subsamples of the training set, and then combining them using three different combination approaches. Moreover, we propose two new cost-sensitive combination approaches: cost-sensitive weighted voting and cost-sensitive stacking, the latter being based on the cost-sensitive logistic regression method. Finally, using five databases from four real-world applications (credit card fraud detection, churn modeling, credit scoring and direct marketing), we evaluate the proposed method against state-of-the-art example-dependent cost-sensitive techniques, namely cost-proportionate sampling, Bayes minimum risk and cost-sensitive decision trees. The results show that the proposed algorithms achieve higher savings on all databases.
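The framework described in the abstract (base models trained on random subsamples, then combined by cost-sensitive weighted voting) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: one-feature threshold stumps stand in for the example-dependent cost-sensitive decision trees, the random subsamples are bootstrap samples, each model's voting weight is its savings on its out-of-bag examples, correct predictions cost nothing, and savings are measured relative to the cheaper of the two constant predictions. All function names and the toy data are hypothetical.

```python
import random


def total_cost(y_true, y_pred, fp_cost, fn_cost):
    """Sum per-example costs; correct predictions are assumed to cost 0."""
    cost = 0.0
    for y, p, c_fp, c_fn in zip(y_true, y_pred, fp_cost, fn_cost):
        if y == 1 and p == 0:
            cost += c_fn  # missed positive: pay this example's false-negative cost
        elif y == 0 and p == 1:
            cost += c_fp  # false alarm: pay this example's false-positive cost
    return cost


def savings(y_true, y_pred, fp_cost, fn_cost):
    """Relative cost reduction versus the cheaper constant prediction (assumed definition)."""
    n = len(y_true)
    base = min(total_cost(y_true, [0] * n, fp_cost, fn_cost),
               total_cost(y_true, [1] * n, fp_cost, fn_cost))
    if base == 0:
        return 0.0
    return (base - total_cost(y_true, y_pred, fp_cost, fn_cost)) / base


def fit_stump(x, y, fp_cost, fn_cost):
    """Stand-in base learner: threshold t minimising the cost of 'predict 1 if x > t'."""
    best_t, best_cost = None, float("inf")
    for t in [min(x) - 1] + sorted(set(x)):
        c = total_cost(y, [int(v > t) for v in x], fp_cost, fn_cost)
        if c < best_cost:
            best_t, best_cost = t, c
    return best_t


def fit_ensemble(x, y, fp_cost, fn_cost, n_models=7, seed=0):
    """Train stumps on bootstrap samples; weight each by its out-of-bag savings."""
    rng = random.Random(seed)
    n = len(x)

    def pick(seq, idx):
        return [seq[i] for i in idx]

    models = []
    for _ in range(n_models):
        bag = [rng.randrange(n) for _ in range(n)]
        oob = sorted(set(range(n)) - set(bag))
        t = fit_stump(pick(x, bag), pick(y, bag), pick(fp_cost, bag), pick(fn_cost, bag))
        oob_pred = [int(x[i] > t) for i in oob]
        # Negative savings are clipped so a harmful model gets no vote.
        w = max(0.0, savings(pick(y, oob), oob_pred, pick(fp_cost, oob), pick(fn_cost, oob)))
        models.append((t, w))
    return models


def predict(models, x):
    """Weighted vote: predict 1 when models voting 1 hold more than half the weight."""
    weights = [w for _, w in models]
    if sum(weights) == 0:
        weights = [1.0] * len(models)  # fall back to plain majority voting
    total = sum(weights)
    preds = []
    for v in x:
        vote_for_1 = sum(w for (t, _), w in zip(models, weights) if v > t)
        preds.append(int(vote_for_1 > total / 2))
    return preds


# Toy data (hypothetical): a false negative costs more for larger x,
# while a false positive has a fixed cost of 1.
x = list(range(20))
y = [int(v >= 12) for v in x]
fp_cost = [1.0] * len(x)
fn_cost = [10.0 + v for v in x]

models = fit_ensemble(x, y, fp_cost, fn_cost)
preds = predict(models, x)
print("ensemble savings:", savings(y, preds, fp_cost, fn_cost))
```

Weighting votes by savings rather than by accuracy means a base model counts for more when it avoids the expensive mistakes, which is the point of evaluating these methods by savings instead of error rate.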
