LG ML, Oct 25, 2018

Superensemble Classifier for Improving Predictions in Imbalanced Datasets

Tanujit Chakraborty, Ashis Kumar Chakraborty

arXiv:1810.11317v1, 2.97 citations

Originality: Incremental advance

AI Analysis

This paper addresses poor minority-class prediction in imbalanced datasets, a common problem for machine learning practitioners. The contribution appears incremental because it combines existing techniques.

The paper tackles imbalanced classification problems by proposing a superensemble classifier that combines Hellinger distance decision trees with radial basis function networks, showing effectiveness and competitiveness with state-of-the-art models on real-life datasets.

Learning from an imbalanced dataset is a tricky proposition. Because these datasets are biased towards one class, most existing classifiers tend not to perform well on minority class examples. Conventional classifiers usually aim to optimize the overall accuracy without considering the relative distribution of each class. This article presents a superensemble classifier that maps Hellinger distance decision trees (HDDT) into a radial basis function network (RBFN) framework to improve predictions in imbalanced classification problems. Regularity conditions for universal consistency and the idea of parameter optimization of the proposed model are provided. The proposed distribution-free model can be applied to problems that combine feature selection with imbalanced classification. We also provide numerical evidence on various real-life data sets to assess the performance of the proposed model and show its effectiveness and competitiveness with respect to different state-of-the-art models.
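
To make the HDDT idea concrete, the following is a minimal sketch of the Hellinger distance split criterion as it is commonly defined in the HDDT literature for binary classification. It is not taken from the paper and does not implement the superensemble or the RBFN stage; the function name `hellinger_split_score` and its arguments are hypothetical, chosen only for illustration. The point it shows is that the score depends on how each class is distributed across the branches of a split, not on the class priors, which is why the criterion is usually described as insensitive to class skew.

```python
import math


def hellinger_split_score(pos_counts, neg_counts):
    """Hellinger distance between the two class-conditional branch distributions.

    pos_counts[j] and neg_counts[j] are the numbers of minority and majority
    examples that fall into branch j of a candidate split (hypothetical inputs).
    """
    total_pos = sum(pos_counts)
    total_neg = sum(neg_counts)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes must be present at the node")
    # Compare the fraction of each class sent to every branch.
    squared_terms = (
        (math.sqrt(p / total_pos) - math.sqrt(n / total_neg)) ** 2
        for p, n in zip(pos_counts, neg_counts)
    )
    return math.sqrt(sum(squared_terms))


# Same branch proportions, ten times more majority examples in the second call.
mild_skew = hellinger_split_score([8, 2], [100, 900])
heavy_skew = hellinger_split_score([8, 2], [1000, 9000])
print(mild_skew, heavy_skew)
```

Under this definition the score ranges from 0 (both classes are distributed identically across the branches) to the square root of 2 (the split separates the classes completely), and a tree builder would prefer the candidate split with the larger score.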

